AWS Glue 3.0 Docker Image - can we increase Spark configurations?


My Spark code is running extremely slowly and either times out or does not run at all, even though I have just 4 records in the source S3 bucket that I am trying to process. Can anyone suggest whether we can increase the Spark resources in the Docker container to make it run faster?

I am using the AWS Glue 3.0 Docker image to set up my local environment. I am using a notebook to submit the jobs and, being a newbie, I do not know how to change the spark-submit configuration in this environment.

Asked 2 years ago · Viewed 666 times
2 Answers

Hello,

If you are using a Jupyter notebook with your Glue Docker image, you can use the Spark magic command %%configure to set the Spark driver and executor memory/vCores depending on your system's resources.

You can get a list of the available Spark magic commands by simply running %help in a cell.

The command looks something like this:

%%configure
{
    "driverMemory":"2000M",
    "executorMemory":"2000M"
}
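
If memory alone does not help, you can also pin down cores, executor count, and shuffle parallelism in the same cell. The extra keys below (driverCores, executorCores, numExecutors, conf) are a sketch following the common Livy/Spark-magic session settings rather than anything confirmed in this thread, so run %help first and keep only the keys your image actually accepts. With only 4 records, lowering spark.sql.shuffle.partitions from its default of 200 often does more than adding memory.

%%configure
{
    "driverMemory": "2000M",
    "driverCores": 2,
    "executorMemory": "2000M",
    "executorCores": 2,
    "numExecutors": 1,
    "conf": {
        "spark.sql.shuffle.partitions": "8"
    }
}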

You can get a list of the configurable Spark parameters from here.

AWS
Support Engineer
Answered 2 years ago
  • This did not work for me

Accepted Answer

I figured out that I was not using the right Docker image. Since I was on a Mac M1, I had to use the Glue 3.0 ARM64 image; I had been using the AMD64 (x86_64) image, which was causing the problem. Closing this question.

Answered 1 year ago
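
For anyone hitting the same issue on Apple Silicon, the sketch below shows one way to pull and start the Glue 3.0 container while explicitly requesting the linux/arm64 platform. The repository name amazon/aws-glue-libs and the JupyterLab start command follow the usual Glue local-development setup, but the exact ARM-compatible tag varies, so treat glue_libs_3.0.0_image_01 as a placeholder and check Docker Hub for the current tag that matches your machine.

# Pull the image for the ARM64 platform (tag is a placeholder; verify the current one on Docker Hub)
docker pull --platform linux/arm64 amazon/aws-glue-libs:glue_libs_3.0.0_image_01

# Start JupyterLab, mounting local AWS credentials and the current directory as the workspace
docker run -it --platform linux/arm64 \
    -v ~/.aws:/home/glue_user/.aws \
    -v "$(pwd)":/home/glue_user/workspace/jupyter_workspace \
    -e DISABLE_SSL=true \
    -p 4040:4040 -p 18080:18080 -p 8998:8998 -p 8888:8888 \
    --name glue_jupyter_lab \
    amazon/aws-glue-libs:glue_libs_3.0.0_image_01 \
    /home/glue_user/jupyter/jupyter_start.sh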
