You can read the job run ID from the resolved options in your AWS Glue script. See if this helps:
import sys
from awsglue.utils import getResolvedOptions
args = getResolvedOptions(sys.argv, ['JOB_NAME'])
job_run_id = args['JOB_RUN_ID']
This function should help you get the job_run_id. Call it right at the beginning of your Python job:
import boto3
import botocore.exceptions

def get_running_job_id(job_name):
    session = boto3.session.Session()
    glue_client = session.client('glue')
    try:
        response = glue_client.get_job_runs(JobName=job_name)
        for res in response['JobRuns']:
            print("Job Run id is:" + res.get("Id"))
            print("status is:" + res.get("JobRunState"))
            if res.get("JobRunState") == "RUNNING":
                return res.get("Id")
        # Return None only after every run in the response has been checked
        return None
    except botocore.exceptions.ClientError as e:
        raise Exception("boto3 client error in get_running_job_id: " + str(e))
    except Exception as e:
        raise Exception("Unexpected error in get_running_job_id: " + str(e))
Apparently the call
getResolvedOptions(sys.argv, ["JOB_NAME","JOB_RUN_ID"])
works only for PySpark jobs, not for Python shell jobs. You can confirm this by running a Python shell job and a PySpark job, calling print(sys.argv) in each, and comparing the full argument lists.
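To make that comparison easier to read, you could look for the flag directly in the argument list. This is only a minimal sketch: `find_arg` is a hypothetical helper name, and it assumes the parameters show up in sys.argv either as `--KEY value` pairs or as `--KEY=value`.

```python
import sys


def find_arg(argv, name):
    """Return the value given for --name in argv, or None if the flag is absent."""
    flag = "--" + name
    for i, token in enumerate(argv):
        # Form 1: "--KEY value" (two separate tokens)
        if token == flag and i + 1 < len(argv):
            return argv[i + 1]
        # Form 2: "--KEY=value" (single token)
        if token.startswith(flag + "="):
            return token[len(flag) + 1:]
    return None


if __name__ == "__main__":
    print(sys.argv)  # the full argument list the job received
    print("JOB_RUN_ID:", find_arg(sys.argv, "JOB_RUN_ID"))
```

If the second line prints None in a Python shell job, the run ID was not passed on the command line and has to be looked up another way.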
For the job run ID:
import boto3
glue_client = boto3.client("glue")
response = glue_client.get_job_runs(JobName="<your job name>")
job_run_id = response["JobRuns"][0]["Id"]
Use this code as early as possible within the Python shell job to get the job run ID of the most recent execution.
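If the job can have several runs in flight, the first entry may not be the run executing your script. A rough sketch of one way to narrow it down is below: keep only the runs whose state is RUNNING and take the one that started last. `pick_running_run_id` is a hypothetical helper, the sample IDs are made up, and it assumes each entry carries the `Id`, `JobRunState` and `StartedOn` fields. It still cannot tell two truly concurrent runs apart.

```python
from datetime import datetime


def pick_running_run_id(job_runs):
    """Return the Id of the most recently started RUNNING run, or None."""
    running = [r for r in job_runs if r.get("JobRunState") == "RUNNING"]
    if not running:
        return None
    # Assumes every RUNNING entry has a StartedOn timestamp.
    latest = max(running, key=lambda r: r["StartedOn"])
    return latest["Id"]


# Made-up data shaped like response["JobRuns"]
sample = [
    {"Id": "jr_a", "JobRunState": "SUCCEEDED", "StartedOn": datetime(2024, 1, 3)},
    {"Id": "jr_b", "JobRunState": "RUNNING", "StartedOn": datetime(2024, 1, 2)},
    {"Id": "jr_c", "JobRunState": "RUNNING", "StartedOn": datetime(2024, 1, 1)},
]
print(pick_running_run_id(sample))
```

In the job itself you would pass response["JobRuns"] from the get_job_runs call instead of the sample list.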
For the job name: there is a programmatic way to derive it, explained below.
When the job name and the Python script name are the same, we can read the first element of sys.argv and use the following:
job_name = sys.argv[0].split('/')[-1]
This returns scriptname.py. Strip the ".py" if you need only the name part.
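Stripping the extension could look like the sketch below. `job_name_from_script` is a hypothetical helper, and it only gives the right answer under the same assumption as above, namely that the job is named after its script file.

```python
import os
import sys


def job_name_from_script(path):
    """Return the script's file name without directory or extension."""
    return os.path.splitext(os.path.basename(path))[0]


if __name__ == "__main__":
    # sys.argv[0] is the path of the running script
    print(job_name_from_script(sys.argv[0]))
```

Using os.path.splitext rather than slicing off the last three characters also keeps the name intact if the path has no ".py" suffix.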
I tried this, but the job throws an error: KeyError: JOB_RUN_ID. From your message, it seems the job is trying to retrieve a job parameter that was never passed. How can we retrieve the job run ID from the script without passing it as a job parameter?