EMR Serverless: unexpected behavior after changing resources from the defaults

I created an EMR Serverless application with custom resources: specifically, I changed the executors to have more cores and more memory (the memory amount differs from the default for that number of cores, but is within the suggested range). I then ran a PySpark job on this new application. I had previously tested the same PySpark job with the default EMR Serverless resources. This is what I saw in the Spark UI while monitoring the run:

  1. It showed the default number of executor cores rather than the number set in the application created above (the application itself still shows the resources as I set them).
  2. The stages were executed differently compared with (a) the history of the same job run on the default EMR Serverless resource settings, and (b) the later part of this same run (see the next point).
  3. Toward the end of the job, the execution pattern shown in the UI changed from the earlier pattern (previous point) and, very close to the end and at the end, looked similar to the pattern from the run on the application created with default EMR Serverless resources. [For clarity: the change from the middle of the run to the end is not simply a time-scale rendering effect in the visualization.]
  4. At the end: (a) the Jobs UI looked similar to that of the same job run with default EMR Serverless resources (as mentioned above); (b) the same job took about the same time to complete under the two application configurations; (c) the application with the modified resource configuration, however, used about three times the resources. [Remember: the Environment tab showed the same (default) number of executor cores for the application where the resources were modified.]

QUESTION: Despite taking the same time to complete and showing the same number of executor cores as the default EMR Serverless application, (1) the resource utilization and the billed resources were about three times higher in the application where the resources were modified, and (2) the job execution pattern shown in the UI changed as the job proceeded.

Can someone clarify as to why/how that can happen?
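
For reference, here is a minimal sketch of how the executor size could also be set explicitly at job submission, so that the Environment tab can be compared against a known value. It rests on an assumption I have not confirmed: that the worker sizes configured on the application and the Spark executor properties (`spark.executor.cores`, `spark.executor.memory`) are separate settings, and that the job falls back to Spark defaults unless they are passed in `sparkSubmitParameters`. The helper names, the 8-core / 48 GB values, and the placeholder IDs are hypothetical and are not taken from my actual setup.

```python
from typing import Dict, Optional


def build_spark_submit_parameters(
    executor_cores: int,
    executor_memory_gb: int,
    extra_conf: Optional[Dict[str, str]] = None,
) -> str:
    """Build a sparkSubmitParameters string with explicit executor sizing.

    Hypothetical helper: it only formats standard Spark properties as
    `--conf key=value` pairs, in insertion order.
    """
    conf = {
        "spark.executor.cores": str(executor_cores),
        "spark.executor.memory": f"{executor_memory_gb}g",
    }
    if extra_conf:
        conf.update(extra_conf)
    return " ".join(f"--conf {key}={value}" for key, value in conf.items())


def build_job_driver(entry_point: str, executor_cores: int, executor_memory_gb: int) -> dict:
    """Build the jobDriver structure expected by EMR Serverless StartJobRun."""
    return {
        "sparkSubmit": {
            "entryPoint": entry_point,
            "sparkSubmitParameters": build_spark_submit_parameters(
                executor_cores, executor_memory_gb
            ),
        }
    }


# Illustrative values only (assumed, not from the original job).
job_driver = build_job_driver("s3://my-bucket/scripts/job.py", 8, 48)
print(job_driver["sparkSubmit"]["sparkSubmitParameters"])

# Submitting would then look roughly like this (requires boto3 and AWS
# credentials; the application ID and role ARN are placeholders):
#
#   import boto3
#   client = boto3.client("emr-serverless")
#   client.start_job_run(
#       applicationId="<application-id>",
#       executionRoleArn="<execution-role-arn>",
#       jobDriver=job_driver,
#   )
```

If the Environment tab then reports the values passed here, that would suggest the application-level worker configuration alone does not change the executor cores Spark uses. Whether that also accounts for the threefold resource usage is still an open question for me.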

Asked 2 months ago · 136 views
No answers