1 Answer
The question isn't clear as worded, so let me answer based on what I understand. If you are asking how to include PySpark libraries in an AWS Glue Python Shell job, this cannot be done: a Python Shell job runs on a single compute resource and does not provide the driver and executors that a Spark engine needs.
For an AWS Glue PySpark job, you import the libraries as usual. When you create a job from the console, it provides the default imports needed to get started.
answered a year ago
I would just add that for Glue ETL jobs (PySpark), you can find info on how to add additional libraries here: https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-python-libraries.html
For Glue Python Shell jobs, you can add Python libraries (not Spark), and the method is described here: https://docs.aws.amazon.com/glue/latest/dg/add-job-python.html#create-python-extra-library
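To make the mechanism in those docs concrete, here is a minimal sketch of the job parameters involved. These are the `DefaultArguments` you would pass when creating a job (e.g. via boto3's `glue.create_job` or in the console's job parameters). The package versions and S3 paths below are placeholders, not values from this thread:

```python
# Hypothetical sketch of Glue job DefaultArguments for adding libraries.
# These dicts would be passed as the DefaultArguments field of a Glue job
# definition (e.g. boto3 glue.create_job); paths/versions are placeholders.

# Glue ETL (PySpark) job: pip-installable modules and/or your own .py/.zip files.
etl_default_args = {
    # pure-Python packages installed with pip when the job starts
    "--additional-python-modules": "pyarrow==7.0.0,awswrangler",
    # your own code, staged in S3 (placeholder bucket/key)
    "--extra-py-files": "s3://my-bucket/libs/my_helpers.py",
}

# Glue Python Shell job: extra Python libraries only -- Spark libraries
# cannot be added, since there is no Spark engine behind this job type.
shell_default_args = {
    "--extra-py-files": "s3://my-bucket/libs/my_package-0.1-py3-none-any.whl",
}
```

The key distinction matches the answer above: `--additional-python-modules` and `--extra-py-files` add ordinary Python code, but neither can turn a Python Shell job into a Spark environment.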