AWS Glue 3.0 Docker image: can we increase the Spark configuration?

My Spark code runs extremely slowly and either times out or does not run at all, even though the source S3 bucket I am trying to process holds only 4 records. Can anyone suggest whether we can give Spark more resources in the Docker container to make it run faster?

I am using the AWS Glue 3.0 Docker image to set up my local environment. I submit jobs from a notebook and, being a newbie, do not know how to change the spark-submit configuration in this environment.

Asked 2 years ago, 666 views
2 answers

Hello,

If you are using a Jupyter notebook with your Glue Docker image, you can use the Spark magic command %%configure to set Spark's driver and executor memory and vcores to suit your machine.

You can list the available Spark magic commands by running %help in a cell.

The command looks something like this:

%%configure
{
    "driverMemory":"2000M",
    "executorMemory":"2000M"
}

You can get the list of configurable Spark parameters from here.
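
If you want to try a few different memory sizes, a small helper can generate the cell text for you. This is only a sketch: build_configure_cell is a hypothetical name, not part of sparkmagic or AWS Glue, and the size check (digits followed by M or G) is an assumption that is stricter than what Spark itself accepts.

```python
import json
import re

# Hypothetical helper for illustration; not part of sparkmagic or AWS Glue.
# Assumption: sizes are written as digits followed by M or G, e.g. "2000M".
_SIZE = re.compile(r"^\d+[MG]$")


def build_configure_cell(driver_memory: str, executor_memory: str) -> str:
    """Return the text of a %%configure cell with the given memory sizes."""
    for value in (driver_memory, executor_memory):
        if not _SIZE.match(value):
            raise ValueError(f"expected a size such as '2000M' or '2G', got {value!r}")
    body = json.dumps(
        {"driverMemory": driver_memory, "executorMemory": executor_memory},
        indent=4,
    )
    # The magic must be the first line of the notebook cell.
    return "%%configure\n" + body
```

Paste the returned text into a notebook cell and run it before the Spark session starts.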

AWS
Support Engineer
Answered 2 years ago
  • This did not work for me

Accepted answer

I figured out that I was not using the right Docker image. Since I am on an M1 Mac, I had to use the Glue 3.0 ARM image; I had been using the AMD image instead, which caused this problem. Closing this question.
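
For anyone hitting the same thing, a quick way to check which image architecture matches your machine is to ask Python what CPU it is running on. This is a minimal sketch: docker_arch is a hypothetical helper name, and the mapping covers only the common machine strings I am aware of.

```python
import platform
from typing import Optional


# Hypothetical helper for illustration; the mapping below is an assumption
# covering the usual values reported by platform.machine().
def docker_arch(machine: Optional[str] = None) -> str:
    """Map a machine string to the Docker architecture family to pull."""
    name = (machine or platform.machine()).lower()
    if name in ("arm64", "aarch64"):
        # Apple Silicon Macs report "arm64"; Linux on ARM reports "aarch64".
        return "arm64"
    if name in ("x86_64", "amd64"):
        return "amd64"
    raise ValueError(f"unrecognized machine type: {name!r}")
```

Calling docker_arch() with no argument checks the current machine. Note that a Python interpreter built for x86_64 and running under Rosetta will report x86_64 even on an M1, so run the check with a native interpreter.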

Answered a year ago
