1 Answer
For a Python shell job, dependencies should be packaged as a .egg or .whl file.
To reference a file or .zip file in S3 from a Glue ETL Python shell job, add the file paths under 'Referenced files path' (e.g. s3://bucketName/foldername/depFile1.txt, s3://bucketName/foldername/depFile2.txt) or under 'Python library path'.
These files are made available in a directory under '/tmp'. The directory name follows the pattern 'glue-python-libs*'.
The files can be referenced in the job using the following code snippet:
import os

## dependencies are staged under /tmp in a directory named 'glue-python-libs-*'
base_dir = '/tmp'
gluelib = [d for d in os.listdir(base_dir) if d.startswith('glue-python-libs-')]
lib_dir = os.path.join(base_dir, gluelib[0])
## to print files/directories added in job configuration
print(os.listdir(lib_dir))
## to read a file added to job configuration 'Referenced files path'
with open(os.path.join(lib_dir, 'depFile1.txt'), "r") as f:
    lines = f.readlines()
print(lines)
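If the staging directory might be missing (for example, when no referenced files are configured), the lookup can be wrapped in a small helper that fails with a clear message. This is only a sketch: the function names `find_glue_lib_dir` and `read_referenced_file` are hypothetical, and the `/tmp` location and `glue-python-libs*` naming are taken from the description above rather than from a documented guarantee.

```python
import glob
import os


def find_glue_lib_dir(base_dir="/tmp"):
    """Return the first 'glue-python-libs*' directory under base_dir, or None.

    Hypothetical helper; the directory naming is assumed from the answer above.
    """
    pattern = os.path.join(base_dir, "glue-python-libs*")
    matches = sorted(p for p in glob.glob(pattern) if os.path.isdir(p))
    return matches[0] if matches else None


def read_referenced_file(file_name, base_dir="/tmp"):
    """Read all lines of a file listed under 'Referenced files path'."""
    lib_dir = find_glue_lib_dir(base_dir)
    if lib_dir is None:
        # Fail loudly instead of raising an IndexError on an empty list
        raise FileNotFoundError(
            "no glue-python-libs* directory found under " + base_dir
        )
    with open(os.path.join(lib_dir, file_name), "r") as f:
        return f.readlines()
```

Inside the job, `read_referenced_file('depFile1.txt')` would then return the lines of the file, or raise `FileNotFoundError` if the directory or the file is not present.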
answered 2 years ago