EMR Serverless: unexpected behavior after changing resources from the defaults


I created an EMR Serverless application with custom resources: specifically, I gave the executors more cores and more memory (the memory differs from the default amount for that number of cores, but is within the suggested range). I ran a PySpark job on this new application. I had previously tested the same PySpark job with the default serverless resources. While monitoring the run in the Spark UI, I observed the following:

  1. The UI showed the default number of executor cores rather than the value set in the application created above (the application itself still shows the resources as I set them).
  2. The stages executed differently compared to (a) the history of the same job run with the default EMR Serverless resource settings, and (b) the later part of this same run (see the next point).
  3. Very close to the end of the job, and at the end, the pattern in the UI changed from the one described in the previous point and came to resemble the pattern from the run on the application created with default EMR Serverless resources. [For clarity: the change between the middle of the run and the end is not simply a time-scale rendering effect in the visualization.]
  4. At the end: (a) the Jobs UI looked similar to that of the same job run with default EMR Serverless resources (as mentioned above); (b) the job took about the same time to complete in both application configurations; (c) the application with the modified resource configuration, however, used about three times the resources. [Remember: the Environment tab showed the same number of executor cores in the application where the resources were modified.]
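To pin down observation 1, it may help to compare the effective Spark properties of the two runs side by side. The following is a minimal sketch, not a definitive diagnostic: the helper `diff_spark_conf` and the property values are hypothetical placeholders. In a real job the dictionaries could be filled from `dict(spark.sparkContext.getConf().getAll())` or copied from the Environment tab.

```python
def diff_spark_conf(baseline, modified, keys=None):
    """Return {key: (baseline_value, modified_value)} for properties that differ.

    Both arguments are plain dicts of Spark property name -> value.
    If `keys` is given, only those properties are compared.
    """
    if keys is None:
        keys = sorted(set(baseline) | set(modified))
    return {
        k: (baseline.get(k), modified.get(k))
        for k in keys
        if baseline.get(k) != modified.get(k)
    }


# Hypothetical values for illustration only; substitute the values
# actually reported by each run.
default_run = {"spark.executor.cores": "4", "spark.executor.memory": "14g"}
custom_run = {"spark.executor.cores": "4", "spark.executor.memory": "14g"}

differences = diff_spark_conf(
    default_run,
    custom_run,
    keys=["spark.executor.cores", "spark.executor.memory"],
)
print(differences or "No differences in the compared properties")
```

An empty result for the executor properties would be consistent with what the Environment tab showed: the job-level Spark settings were the same in both runs, regardless of what the application was configured with.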

QUESTION: Despite taking the same time to complete and showing the same number of executor cores as the default EMR Serverless application, (1) the resource utilization and the billed resources were three times higher in the application where the resources were modified, and (2) the job execution pattern shown in the UI changed as the job proceeded.

Can someone clarify as to why/how that can happen?
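For reference, the arithmetic behind the "three times" figure can be sketched as follows. This is only an illustration under stated assumptions: it assumes usage is aggregated as worker size multiplied by run time, and the worker counts and sizes are hypothetical, not taken from the actual runs. It shows one way equal run time and a 3x usage figure could coexist (larger workers provisioned for the same duration); it does not confirm that this is what happened.

```python
def worker_usage(workers, vcpu_per_worker, memory_gb_per_worker, hours):
    """Aggregate usage, assuming usage = worker size x worker count x duration."""
    return {
        "vcpu_hours": workers * vcpu_per_worker * hours,
        "memory_gb_hours": workers * memory_gb_per_worker * hours,
    }


# Hypothetical numbers: same worker count and run time, workers 3x larger.
default_usage = worker_usage(workers=10, vcpu_per_worker=4,
                             memory_gb_per_worker=16, hours=1.0)
custom_usage = worker_usage(workers=10, vcpu_per_worker=12,
                            memory_gb_per_worker=48, hours=1.0)

vcpu_ratio = custom_usage["vcpu_hours"] / default_usage["vcpu_hours"]
memory_ratio = custom_usage["memory_gb_hours"] / default_usage["memory_gb_hours"]
print(f"vCPU-hours ratio: {vcpu_ratio}, memory GB-hours ratio: {memory_ratio}")
```

If the number of tasks each executor runs in parallel stays at the default, larger workers would not shorten the run, which would match the equal completion times reported above.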

Asked 2 months ago, viewed 135 times
No answers