1 Answer
The question isn't clear as worded, so let me clarify based on what I understand. If you are asking how to include PySpark libraries in an AWS Glue Python Shell job, this cannot be done: a Python Shell job runs on a single compute resource and does not provide the driver and executors that a Spark engine needs.
For an AWS Glue PySpark job, you import the libraries as usual. When you create a job from the console, it provides the default imports needed to start.
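For reference, the console-generated boilerplate for a PySpark job typically looks like the sketch below. This is illustrative, not an official template; `awsglue` is only importable inside the Glue runtime, so the imports are guarded here so the snippet can be read and run outside AWS as well.

```python
# Typical starting boilerplate for an AWS Glue PySpark (ETL) job.
# Hedged sketch: awsglue exists only in the Glue runtime, so the
# imports are wrapped in try/except for illustration purposes.
GLUE_RUNTIME = True
try:
    import sys
    from awsglue.transforms import *
    from awsglue.utils import getResolvedOptions
    from pyspark.context import SparkContext
    from awsglue.context import GlueContext
    from awsglue.job import Job
except ImportError:
    GLUE_RUNTIME = False  # running outside AWS Glue (e.g. locally)

if GLUE_RUNTIME:
    # Resolve the job name passed by Glue, then set up contexts.
    args = getResolvedOptions(sys.argv, ["JOB_NAME"])
    sc = SparkContext()
    glueContext = GlueContext(sc)
    spark = glueContext.spark_session
    job = Job(glueContext)
    job.init(args["JOB_NAME"], args)
    # ... your transformations go here ...
    job.commit()
```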
answered a year ago
I would just add that for Glue ETL (PySpark) jobs, you can find information on adding additional libraries here: https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-python-libraries.html
For Glue Python Shell jobs, you can add Python libraries (not Spark ones); the method is described here: https://docs.aws.amazon.com/glue/latest/dg/add-job-python.html#create-python-extra-library
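In both cases the libraries are attached through special job parameters passed as `DefaultArguments`. The sketch below builds those argument dictionaries in Python (usable with boto3's Glue client); the bucket, file names, and module versions are hypothetical placeholders.

```python
# Sketch of the special job parameters that attach libraries to Glue jobs.
# S3 paths and package names are hypothetical examples.

# Glue ETL (PySpark) job: .py/.zip dependencies on S3, plus
# pip-installable modules via --additional-python-modules.
etl_job_args = {
    "--extra-py-files": "s3://my-bucket/libs/my_helpers.zip",
    "--additional-python-modules": "awswrangler,pyarrow",
}

# Glue Python Shell job: plain-Python .whl/.egg files only -- no Spark libraries.
shell_job_args = {
    "--extra-py-files": "s3://my-bucket/libs/my_package-0.1-py3-none-any.whl",
}

# Usage with boto3 (uncomment inside an AWS environment with a valid role):
# import boto3
# boto3.client("glue").create_job(
#     Name="my-shell-job",
#     Role="MyGlueRole",
#     Command={"Name": "pythonshell", "PythonVersion": "3.9"},
#     DefaultArguments=shell_job_args,
# )
```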