2 Answers
2
Pandas API on Spark was introduced in Spark 3.2. You'll need to execute your job using AWS Glue 4.0, which comes with Spark 3.3.0.
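To make the version requirement concrete, here is a minimal sketch of a check you could run before importing pyspark.pandas. The helper name supports_pandas_api is hypothetical (it is not part of Spark or Glue), and it assumes the version string starts with "major.minor", as the value of spark.version does.

```python
def supports_pandas_api(spark_version: str) -> bool:
    """Return True if the Spark version is 3.2 or newer.

    Hypothetical helper for illustration; only the first two
    dot-separated components of the version string are inspected.
    """
    major, minor = (int(part) for part in spark_version.split(".")[:2])
    return (major, minor) >= (3, 2)


# Inside a Glue job you would pass spark.version instead of a literal.
print(supports_pandas_api("3.3.0"))  # Spark version shipped with AWS Glue 4.0
print(supports_pandas_api("3.1.1"))  # Spark version shipped with AWS Glue 3.0
```

With the two literals above, the first call prints True and the second prints False, which is why the job has to run on AWS Glue 4.0.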
Steps:
- Create a new job using the Spark script editor. Note that AWS Glue interactive sessions are not yet available for AWS Glue 4.0.
- Select "Create a new script with boilerplate code" (default)
- After the line job.init(), write your code. For example:

    import pyspark.pandas as ps
    import numpy as np

    s = ps.Series([1, 3, 5, np.nan, 6, 8])
answered a year ago
0
Take a look at this page: https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-python-libraries.html#glue20-modules-provided
If you package a custom .py file into a zip and upload it to the Glue job script location, make sure the path to the file is correct.
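As a rough illustration of what "the right path" means inside the archive, the sketch below builds a zip with a module at the archive root and then imports it locally. The names build_module_zip, libs.zip and my_helpers are made up for this example, and the local sys.path import only approximates how a Glue job would pick up the archive.

```python
import importlib
import sys
import tempfile
import zipfile
from pathlib import Path


def build_module_zip(zip_path: Path, module_name: str, source: str) -> Path:
    """Write `source` as <module_name>.py at the root of a new zip archive."""
    with zipfile.ZipFile(zip_path, "w") as archive:
        # The name given here is the path inside the archive. Keeping the
        # file at the root means a plain "import <module_name>" can find it.
        archive.writestr(f"{module_name}.py", source)
    return zip_path


# Hypothetical module content, used only for this demonstration.
workdir = Path(tempfile.mkdtemp())
zip_path = build_module_zip(
    workdir / "libs.zip",
    "my_helpers",
    "def double(x):\n    return 2 * x\n",
)

# Inspect the archive layout before uploading it anywhere.
with zipfile.ZipFile(zip_path) as archive:
    entries = archive.namelist()
print(entries)

# Simulate the import locally by putting the zip on the module search path.
sys.path.insert(0, str(zip_path))
my_helpers = importlib.import_module("my_helpers")
print(my_helpers.double(21))
```

If the file had been stored under a nested folder instead (for example src/my_helpers.py), the plain import would fail, which is the kind of path mistake worth checking for before uploading.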
answered a year ago