Glue PySpark vs Shell


I zipped my modules into a zip file, uploaded it to S3, and added it to both the PySpark and Python Shell jobs under the Python library path parameter.
In both jobs I am using the same import syntax. The PySpark job works, but the Python Shell job raises an error saying my module is not found:

ModuleNotFoundError: No module named 'expect_multicolumn_values_not_null'

Is there a difference between how PySpark and Python Shell jobs import modules? How can I make it work in both?

asked 2 months ago · 257 views
1 Answer

Hi,

Based on my understanding, Python Shell jobs handle libraries differently: you can consider the approach described in Providing your own Python library. For your use case, you may need to package your modules as an .egg or .whl file rather than a plain .zip.

You can also refer to these posts: How do I use external Python libraries in my AWS Glue 2.0 ETL job? and External python libraries in an AWS Glue python shell job.
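To illustrate the packaging difference, here is a minimal local sketch of what a pure-Python wheel is: just a zip archive with an extra .dist-info metadata directory. The module name `mylib` and version below are made up for the example, not taken from your job; this only demonstrates the format, it does not talk to AWS Glue.

```python
import os
import sys
import tempfile
import zipfile

# A pure-Python wheel is a zip archive with a .dist-info directory.
# Build a minimal one containing a single module (names are made up).
tmp = tempfile.mkdtemp()
whl_path = os.path.join(tmp, "mylib-0.1-py3-none-any.whl")

with zipfile.ZipFile(whl_path, "w") as whl:
    whl.writestr("mylib.py", "def greet():\n    return 'hello from wheel'\n")
    whl.writestr("mylib-0.1.dist-info/METADATA",
                 "Metadata-Version: 2.1\nName: mylib\nVersion: 0.1\n")
    whl.writestr("mylib-0.1.dist-info/WHEEL",
                 "Wheel-Version: 1.0\nGenerator: manual\n"
                 "Root-Is-Purelib: true\nTag: py3-none-any\n")
    whl.writestr("mylib-0.1.dist-info/RECORD", "")

# Like a plain zip, a wheel can be imported straight from sys.path
# for a quick local check; Glue Python Shell jobs instead install
# the wheel before the job runs.
sys.path.insert(0, whl_path)
import mylib
print(mylib.greet())  # prints: hello from wheel
```

In practice you would not hand-build the archive like this: create a small project with a setup.py or pyproject.toml, run `pip wheel .` (or `python -m build`), upload the resulting .whl to S3, and point the Python library path parameter of the Python Shell job at it.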

Thanks, Rama

AWS
EXPERT
answered 2 months ago
