AWS Glue - Read a 'local' file in Python


In AWS Glue I use a legacy Python package that reads a constant JSON file shipped in the same package. To simplify things, let's say the testLib package contains two files, test_lib.py and data.json. test_lib.py has a function:

def test():
    f = open('data.json')

I uploaded testLib.zip to S3 and use it in the AWS Glue job:

from test_lib import test

test()

The AWS Glue job fails with the error: FileNotFoundError: [Errno 2] No such file or directory: 'data.json'
There are many questions about how to manage files on S3 in AWS Glue, but those approaches require changing the legacy package, which I don't want to do.
Is there any way to configure the AWS Glue job so that it can open 'local' files?

Alex
asked 2 years ago · 3233 views
1 Answer

For a Python shell job, dependencies should be packaged as a .egg or .whl file.

To refer to a file or .zip file in S3 from a Glue ETL Python shell job, add the file paths under 'Referenced files path' (e.g. s3://bucketName/foldername/depFile1.txt, s3://bucketName/foldername/depFile2.txt) or 'Python library path'.

These files are then available in a directory under '/tmp'. The directory name follows the pattern 'glue-python-libs*'.

These files can be referenced in the ETL job using the following code snippet:

import os

# Downloaded dependencies live in a 'glue-python-libs-*' directory under /tmp
base = '/tmp'
gluelib = [d for d in os.listdir(base) if d.startswith('glue-python-libs-')]
lib_dir = os.path.join(base, gluelib[0])

# Print the files/directories added in the job configuration
print(os.listdir(lib_dir))

# Read a file added to the job configuration under 'Referenced files path'
with open(os.path.join(lib_dir, 'depFile1.txt'), 'r') as f:
    lines = f.readlines()
    print(lines)
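
Since the legacy package calls open('data.json') with a relative path, Python resolves it against the current working directory. One possible workaround that leaves the package untouched is to switch the working directory to the download directory before calling it. The sketch below assumes data.json was added under 'Referenced files path' and ends up in the 'glue-python-libs-*' directory; the helper names working_directory and find_glue_lib_dir are hypothetical and not part of any AWS API.

```python
import contextlib
import glob
import os


@contextlib.contextmanager
def working_directory(path):
    """Temporarily switch the process working directory to `path`."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Always restore the original directory, even if the body raises
        os.chdir(previous)


def find_glue_lib_dir(base="/tmp"):
    """Return the first 'glue-python-libs-*' directory under `base`, or None.

    Assumption: referenced files are downloaded into such a directory.
    """
    matches = sorted(glob.glob(os.path.join(base, "glue-python-libs-*")))
    return matches[0] if matches else None
```

In the job script this could be used as `with working_directory(find_glue_lib_dir()): test()`. Note that os.chdir affects the whole process, so any other relative paths used inside the block resolve against the same directory.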
AWS
answered 2 years ago
