Hi,
AWS Glue is a serverless, fully managed service whose Spark runtime comes pre-optimized. While the --conf parameter allows you to change some of the Spark configuration, it should not be used unless explicitly documented, for example in the migration guides from Glue 2.0 to Glue 3.0 (or 4.0).
Most configuration changes, such as the one you are trying to pass, will not be taken into consideration. If you need to scale your job further, increase the number of workers, which increases the number of executors.
If you really need the flexibility to choose the number of executors, the memory configuration, and even different instance types, I would suggest looking at EMR (on EKS or Serverless) to run your Spark code.
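To illustrate the advice above, here is a minimal sketch of scaling a Glue job by raising the worker count rather than passing Spark --conf overrides. The job name and worker counts are hypothetical, and the dict mirrors the parameters accepted by the Glue `StartJobRun` API (e.g. via boto3's `glue` client); the actual API call is left commented out so the sketch stays self-contained.

```python
def build_run_request(job_name: str, num_workers: int) -> dict:
    """Build StartJobRun parameters that control executor capacity.

    On Glue 2.0+ each G.1X worker roughly maps to one Spark executor,
    so raising NumberOfWorkers raises the executor count without
    touching any --conf Spark settings.
    """
    return {
        "JobName": job_name,
        "WorkerType": "G.1X",       # 4 vCPU / 16 GB memory per worker
        "NumberOfWorkers": num_workers,
    }

# Hypothetical job name and worker count, for illustration only.
params = build_run_request("my-etl-job", 20)

# With the AWS SDK installed and credentials configured, this would be:
# import boto3
# client = boto3.client("glue")
# response = client.start_job_run(**params)
```

The point is that capacity is expressed through Glue's own job parameters, which the service is designed to honor, instead of Spark-level settings that Glue may silently ignore.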
hope this helps,
Reading through this documentation: https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-glue-arguments.html#:~:text=AWS%20Glue%20uses%20the%20following%20arguments%20internally%20and%20you%20should%20never%20use%20them%3A — it lists the arguments AWS Glue uses internally, but it does not mention anything about not using --conf. Is there any other documentation that says --conf should not be used?
If you run the "Standard" worker type, it will run 2 executors per worker, but as Fabrizio says, don't try to change those settings on Glue; it's meant to be managed for you.
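As a rough back-of-the-envelope for the worker-to-executor mapping mentioned above, here is a small sketch. The per-worker executor counts and the assumption that one worker is consumed by the Spark driver are illustrative simplifications, not an exact Glue capacity formula.

```python
# Assumed mapping, for illustration: "Standard" workers run 2 executors
# each (as noted in the comment above); G.1X/G.2X workers on Glue 2.0+
# roughly map to 1 executor each.
EXECUTORS_PER_WORKER = {
    "Standard": 2,
    "G.1X": 1,
    "G.2X": 1,
}

def estimated_executors(worker_type: str, num_workers: int) -> int:
    """Estimate Spark executors available to a Glue job.

    Assumes one worker hosts the Spark driver and therefore
    contributes no executors.
    """
    per_worker = EXECUTORS_PER_WORKER[worker_type]
    return max(num_workers - 1, 0) * per_worker
```

For example, a job with 10 G.1X workers would get roughly 9 executors under these assumptions, which is why adding workers is the supported way to scale rather than tuning executor counts through --conf.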