How to pass EMR Serverless PySpark entryPointArguments as variable

0

I have an EMR Serverless PySpark job I am launching from a step function. I am trying to pass arguments to SparkSubmit from the entryPointArguments in the form of variables set in the beginning of the step function i.e. today_date, source, tuned_parameters, which I then use in the PySpark code.

I was able to find a partial solution in this post here however I am trying to pass variables from the step function and not the hardcoded argument i.e.. "prd".

  "JobDriver": {
          "SparkSubmit": {
            "EntryPoint": "s3://xxxx-my-code/test/my_code_edited_3.py",
            "EntryPointArguments": ["-env", "prd", "-source.$", "$.source"]
          }
        }

Using argparse I am able to read the first argument "-env" and it is successfully returning "prd", however I am having troubles figuring out how to pass a variable for the source argument.

已提問 1 年前檢視次數 1394 次
1 個回答
1
已回答 1 年前

您尚未登入。 登入 去張貼答案。

一個好的回答可以清楚地回答問題並提供建設性的意見回饋,同時有助於提問者的專業成長。

回答問題指南