AWS Glue 3.0 Docker image - can we increase the Spark configuration?

0

My Spark code is running extremely slowly, and it either times out or does not run at all, even though I have just 4 records in the source S3 bucket that I am trying to process. Can anyone suggest whether we can increase the Spark resources in the Docker container to make it run faster?

I am using the AWS Glue 3.0 Docker image to set up my local environment. I am using a notebook to submit the jobs and, being a newbie, do not know how to change the spark-submit configuration in this environment.

asked 2 years ago, 646 views
2 Answers
1

Hello,

If you are using a Jupyter notebook with your Glue Docker image, you can use the Sparkmagic command %%configure to set the Spark driver and executor memory and vcores, depending on the resources available on your computer.

You can get a list of the available Sparkmagic commands by running %help in a cell.

The command looks something like this:

%%configure
{
    "driverMemory":"2000M",
    "executorMemory":"2000M"
}

You can get a list of the configurable Spark parameters here.
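Because the body of a %%configure cell is JSON, a stray comma or missing quote can make the cell fail before any Spark setting is applied. The sketch below is one way to check the body in an ordinary Python cell before pasting it. The helper name parse_configure_body is hypothetical and not part of Sparkmagic, and the values are the ones from the example above.

```python
import json

# Body of the %%configure cell from the example above, without the magic line.
CONFIGURE_BODY = """
{
    "driverMemory": "2000M",
    "executorMemory": "2000M"
}
"""


def parse_configure_body(body: str) -> dict:
    """Parse a %%configure body (hypothetical helper, not a Sparkmagic API).

    Raises json.JSONDecodeError if the text is not valid JSON and
    ValueError if the top-level value is not a JSON object.
    """
    settings = json.loads(body)
    if not isinstance(settings, dict):
        raise ValueError("the %%configure body must be a JSON object")
    return settings


settings = parse_configure_body(CONFIGURE_BODY)
print(settings)
```

If the body parses cleanly here, a failure in the notebook is more likely to come from the session or the environment than from the syntax of the cell.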

AWS
SUPPORT ENGINEER
answered 2 years ago
  • This did not work for me

0
Accepted Answer

I figured out that I was not using the right Docker image. Since I am on an M1 Mac, I needed the Glue 3.0 ARM image, but I was using the AMD64 image, which caused this problem. Closing this question.
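For anyone hitting the same issue, a minimal sketch of how to check which architecture the host reports before choosing an image. The helper name docker_platform_for is hypothetical. The values linux/arm64 and linux/amd64 are Docker's standard platform strings. Note that platform.machine() describes the running Python interpreter, so an interpreter running under x86 emulation on an Apple Silicon Mac may still report x86_64.

```python
import platform

# Machine names as reported by platform.machine(), mapped to Docker platform strings.
_DOCKER_PLATFORMS = {
    "arm64": "linux/arm64",    # Apple Silicon macOS
    "aarch64": "linux/arm64",  # ARM Linux
    "x86_64": "linux/amd64",   # Intel/AMD macOS and Linux
    "amd64": "linux/amd64",    # Intel/AMD Windows (reported as "AMD64")
}


def docker_platform_for(machine: str) -> str:
    """Return the Docker platform string for a machine name (hypothetical helper)."""
    key = machine.lower()
    if key not in _DOCKER_PLATFORMS:
        raise ValueError(f"unrecognized machine architecture: {machine!r}")
    return _DOCKER_PLATFORMS[key]


if __name__ == "__main__":
    # Prints linux/arm64 when the interpreter runs natively on an M1 Mac.
    print(docker_platform_for(platform.machine()))
```

If the result is linux/arm64, pick the ARM variant of the Glue image rather than the AMD64 one, since the latter has to run under emulation.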

answered a year ago
