1 Answer
The question isn't clear as worded, so let me answer based on what I understand.
If you are asking how to include PySpark libraries in an AWS Glue Python Shell job, this cannot be done: a Python Shell job runs on a single compute resource and does not provide the driver and executors that a Spark engine needs.
For an AWS Glue PySpark job, you import the libraries as usual. When you create a job from the console, it provides the default imports needed to get started.
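For reference, this is a sketch of the default boilerplate the Glue console typically generates for a PySpark job. It depends on the `awsglue` package, which is only available inside the AWS Glue runtime, so treat it as a template rather than a locally runnable script:

```python
# Template: default imports and setup the Glue console generates
# for a PySpark ETL job. Runs only inside the AWS Glue runtime.
import sys
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job

# Resolve the job name passed in by the Glue service
args = getResolvedOptions(sys.argv, ["JOB_NAME"])

sc = SparkContext()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)
job.init(args["JOB_NAME"], args)

# ... your ETL logic goes here ...

job.commit()
```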
answered a year ago
I would just add that for Glue ETL (PySpark) jobs, you can find information on how to add additional libraries here: https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-python-libraries.html
For Glue Python Shell jobs, you can add Python libraries (not Spark libraries); the method is described here: https://docs.aws.amazon.com/glue/latest/dg/add-job-python.html#create-python-extra-library
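The two mechanisms from the docs above differ mainly in which job parameter carries the libraries. A minimal sketch of the job arguments each job type would use (the S3 paths and module names below are hypothetical placeholders, not values from this thread):

```python
# Sketch: job parameters for adding libraries to the two Glue job types.
# S3 paths and module names are hypothetical placeholders.

# Glue Python Shell job: ship extra pure-Python libraries
# as .egg/.whl files staged in S3 via --extra-py-files.
python_shell_args = {
    "--extra-py-files": "s3://my-bucket/libs/mylib.whl",
}

# Glue PySpark ETL job: have pip install additional modules
# at job start via --additional-python-modules (Glue 2.0+).
pyspark_etl_args = {
    "--additional-python-modules": "pandas==1.5.3,pyarrow",
}

# Either dict would be passed as DefaultArguments to
# glue_client.create_job(...) with boto3, or entered under
# "Job parameters" in the Glue console.
```

The key point: `--extra-py-files` only distributes files you provide, while `--additional-python-modules` performs a pip install, so the latter can pull in compiled dependencies.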