EMR Serverless application configuration (Spark defaults) is not shown for jobs submitted using Airflow


Hi Team,

We invoke our EMR Serverless jobs using the Airflow EMR job submit operator. The EMR application is configured with a set of default Spark runtime configurations. When a job is invoked from Airflow, those configurations are not listed in the job configuration section for that job run. However, if we submit a job directly from the console, the configurations from the application are picked up and listed for the specific job run. How can we enable this for jobs triggered from Airflow? Specifically, the application configuration specifies Java 17, but if the job is triggered from Airflow it still uses Java 8.
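For reference, a minimal sketch of the kind of DAG we use (all IDs, ARNs, and S3 paths are placeholders, and it assumes Airflow 2.x with the Amazon provider installed):

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.amazon.aws.operators.emr import EmrServerlessStartJobOperator

# All IDs, ARNs, and S3 paths below are placeholders.
with DAG(
    dag_id="emr_serverless_job",
    start_date=datetime(2024, 1, 1),
    schedule=None,
    catchup=False,
):
    start_job = EmrServerlessStartJobOperator(
        task_id="start_emr_serverless_job",
        application_id="00fexample123456",
        execution_role_arn="arn:aws:iam::111122223333:role/EMRServerlessJobRole",
        job_driver={
            "sparkSubmit": {
                "entryPoint": "s3://my-bucket/scripts/job.py",
            }
        },
        # configuration_overrides is left unset, so we expect the
        # application-level defaults (e.g. the Java 17 spark-defaults)
        # to apply. If they do not, one possible workaround is to
        # re-state them explicitly here, e.g. a spark-defaults
        # classification that sets spark.emr-serverless.driverEnv.JAVA_HOME
        # and spark.executorEnv.JAVA_HOME to the Java 17 runtime path.
    )
```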

2 Answers

Hello,

Could you share your DAG code, after removing any sensitive information?

Thanks!

AWS
Support Engineer
Nitin_S
answered a month ago

Hello,

In general, as described in the documentation [1], there are three options when overriding the configurations in a job run for an EMR Serverless application:

  1. Override an existing configuration
  2. Add an additional configuration
  3. Remove an existing configuration

I suspect that the MWAA EMR Serverless job submit operator might be removing an existing configuration by passing an empty set, which overrides the default configuration defined at the application level.
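For illustration, here is a minimal boto3 sketch of a StartJobRun request where a job-level classification with empty properties would remove the application-level default, per the removal behavior described in [1] (all IDs and ARNs are placeholders):

```python
import boto3

emr_serverless = boto3.client("emr-serverless")

# Placeholders throughout; the point is the shape of configurationOverrides.
response = emr_serverless.start_job_run(
    applicationId="00fexample123456",
    executionRoleArn="arn:aws:iam::111122223333:role/EMRServerlessJobRole",
    jobDriver={"sparkSubmit": {"entryPoint": "s3://my-bucket/scripts/job.py"}},
    configurationOverrides={
        "applicationConfiguration": [
            {
                # Supplying a classification with empty properties at the
                # job level removes the application-level default for that
                # classification, per the override rules in [1].
                "classification": "spark-defaults",
                "properties": {},
            }
        ]
    },
)
print(response["jobRunId"])
```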

To confirm this, I would recommend comparing the CloudTrail events for StartJobRun invoked from the console and from MWAA.
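For example, a short boto3 sketch that pulls recent StartJobRun events so the request parameters from both callers can be compared side by side (region and credentials setup is assumed):

```python
import boto3

cloudtrail = boto3.client("cloudtrail")

# Pull the most recent StartJobRun events; the CloudTrailEvent payload is a
# JSON string whose requestParameters show exactly what each caller sent,
# including any configurationOverrides.
events = cloudtrail.lookup_events(
    LookupAttributes=[
        {"AttributeKey": "EventName", "AttributeValue": "StartJobRun"}
    ],
    MaxResults=10,
)
for event in events["Events"]:
    print(event["CloudTrailEvent"])
```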

References:
[1] https://docs.aws.amazon.com/emr/latest/EMR-Serverless-UserGuide/default-configs.html#default-configs-override

AWS
answered a month ago
Expert
reviewed a month ago
