EMR Serverless application configuration (Spark defaults) is not applied to jobs submitted using Airflow


Hi Team,

We are invoking our EMR Serverless jobs using an Airflow EMR job submit operator. The EMR Serverless application is configured with a set of Spark default runtime configurations. When a job is invoked from Airflow, those configurations are not listed in the job configuration section for that job run. However, if we submit a job directly from the console, the configurations from the application are picked up and listed for that specific job run. How can we enable this for jobs triggered from Airflow? Specifically, the application configuration specifies Java 17, but if the job is triggered from Airflow it still uses Java 8.
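For reference, a minimal sketch of the kind of DAG we use (Airflow 2.x with the Amazon provider's EmrServerlessStartJobRunOperator; the application ID, role ARN, and S3 paths below are placeholders, not our real values):

```python
# Minimal sketch of the invocation; IDs, ARNs, and paths are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.providers.amazon.aws.operators.emr import EmrServerlessStartJobRunOperator

with DAG(
    dag_id="emr_serverless_job",
    start_date=datetime(2024, 1, 1),
    schedule=None,
    catchup=False,
) as dag:
    start_job = EmrServerlessStartJobRunOperator(
        task_id="start_job",
        application_id="00abcdefghijklmn",  # placeholder application ID
        execution_role_arn="arn:aws:iam::123456789012:role/EMRServerlessJobRole",
        job_driver={
            "sparkSubmit": {
                "entryPoint": "s3://my-bucket/scripts/job.py",
            }
        },
        # No configuration_overrides passed explicitly; we expect the
        # application-level defaults (Java 17) to be inherited by the job run.
    )
```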

2 answers

Hello,

Could you share your DAG code after removing any sensitive information, if any?

Thanks!

AWS
TECHNICAL SUPPORT ENGINEER
Nitin_S
answered a month ago

Hello,

In general, as mentioned in the documentation [1], there are three options when overriding the configurations in a job run for an EMR Serverless application (a sketch follows the list):

  1. Override an existing configuration
  2. Add an additional configuration
  3. Remove an existing configuration
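As a boto3 sketch of what these three options can look like in a StartJobRun request (the application ID, role ARN, entry point, and classification names are illustrative examples, not taken from your setup):

```python
# Sketch of the three configurationOverrides options for StartJobRun.
# All IDs, ARNs, paths, and classifications below are placeholders.
import boto3

emr = boto3.client("emr-serverless")

emr.start_job_run(
    applicationId="00abcdefghijklmn",
    executionRoleArn="arn:aws:iam::123456789012:role/EMRServerlessJobRole",
    jobDriver={"sparkSubmit": {"entryPoint": "s3://my-bucket/scripts/job.py"}},
    configurationOverrides={
        "applicationConfiguration": [
            # 1. Override an existing configuration: the same classification
            #    with new values replaces the application-level defaults.
            {
                "classification": "spark-defaults",
                "properties": {"spark.executor.memory": "4g"},
            },
            # 2. Add an additional configuration: a classification the
            #    application does not define is added for this job run only.
            {
                "classification": "spark-executor-log4j2",
                "properties": {"rootLogger.level": "debug"},
            },
            # 3. Remove an existing configuration: the classification with an
            #    empty set of properties drops the application default.
            {
                "classification": "spark-hive-site",
                "properties": {},
            },
        ]
    },
)
```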

I suspect that the MWAA EMR Serverless job run/submit operator might be removing an existing configuration by passing an empty set, which overrides the default configuration defined at the application level.

To confirm this, I would recommend comparing the CloudTrail events for StartJobRun invoked from the console and from MWAA.
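Assuming you can run boto3 against the account, a quick sketch to pull recent StartJobRun events so you can diff the requestParameters (in particular configurationOverrides) between the two invocation paths:

```python
# Sketch: list recent StartJobRun events so the request parameters sent by
# the console and by MWAA can be compared side by side.
import json

import boto3

cloudtrail = boto3.client("cloudtrail")

response = cloudtrail.lookup_events(
    LookupAttributes=[
        {"AttributeKey": "EventName", "AttributeValue": "StartJobRun"}
    ],
    MaxResults=20,
)

for event in response["Events"]:
    # CloudTrailEvent is the full event record as a JSON string.
    detail = json.loads(event["CloudTrailEvent"])
    print(event["EventTime"], detail.get("userIdentity", {}).get("arn"))
    # The key field to compare between console and MWAA invocations:
    print(json.dumps(detail.get("requestParameters", {}).get("configurationOverrides")))
```

If the MWAA-initiated event shows an applicationConfiguration key that the console-initiated event does not, that would confirm the operator is explicitly overriding the application defaults.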

References: [1] https://docs.aws.amazon.com/emr/latest/EMR-Serverless-UserGuide/default-configs.html#default-configs-override

AWS
answered a month ago
EXPERT
verified a month ago
