I want to monitor my Apache Spark usage in AWS Glue and optimize costs.
Resolution
Monitor usage
To get a summary of the cost of Spark usage in your AWS Glue jobs, use AWS Cost Explorer.
Complete the following steps:
- Open the AWS Billing and Cost Management console.
- In the navigation pane, choose Cost Explorer.
- On the Cost dashboard, view the monthly costs for AWS Glue.
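If you prefer to script this check, you can query the same data through the Cost Explorer API. The following is a minimal sketch; the date range is an example value, and "AWS Glue" is assumed to be the service dimension value that Cost Explorer reports for your Glue usage.

```python
import boto3

# Minimal sketch: query monthly AWS Glue costs with the Cost Explorer API.
# The date range is an example; adjust it for the period that you want to review.
ce = boto3.client("ce")

response = ce.get_cost_and_usage(
    TimePeriod={"Start": "2024-01-01", "End": "2024-04-01"},
    Granularity="MONTHLY",
    Metrics=["UnblendedCost"],
    Filter={"Dimensions": {"Key": "SERVICE", "Values": ["AWS Glue"]}},
)

for period in response["ResultsByTime"]:
    print(period["TimePeriod"]["Start"], period["Total"]["UnblendedCost"]["Amount"])
```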
View usage by job detail
To monitor AWS Glue job details, such as run status, run duration, or data processing unit (DPU) usage, complete the following steps:
- Open the AWS Glue console.
- Under ETL Jobs, choose Job run monitoring.
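To retrieve the same details programmatically, you can call the AWS Glue GetJobRuns API. The following is a minimal sketch; "my-glue-job" is a placeholder job name, and the DPUSeconds field isn't populated for every job configuration.

```python
import boto3

# Minimal sketch: list recent runs for one job and print status, duration, and DPU usage.
# "my-glue-job" is a placeholder; DPUSeconds might be absent depending on the job setup.
glue = boto3.client("glue")

runs = glue.get_job_runs(JobName="my-glue-job", MaxResults=10)
for run in runs["JobRuns"]:
    print(
        run["Id"],
        run["JobRunState"],
        f'{run.get("ExecutionTime", 0)}s',
        run.get("DPUSeconds", "n/a"),
    )
```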
View cost by type of job
To get the costs for a specific type of AWS Glue job, complete the following steps:
- Open the AWS Billing and Cost Management console.
- Under Cost and usage analysis, choose Cost Explorer.
- Under Report parameters, in the Filters section, for Service, choose Glue.
- Under Usage type, select the filter for your job type and include your AWS Region:
  - For a standard job, use the ETL-DPU-Hour filter. For example, for the US West (Oregon) Region, apply USW2-ETL-DPU-Hour.
  - For a Flex job, use the ETL-Flex-DPU-Hour filter. For example, apply USW2-ETL-Flex-DPU-Hour.
  - For an interactive session, use the GlueInteractiveSession-DPU-Hour filter. For example, apply USW2-GlueInteractiveSession-DPU-Hour.
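To get the same breakdown with the Cost Explorer API, group your Glue costs by usage type. The following is a minimal sketch; the date range is an example value, and the usage type strings that are returned depend on your Region and the job types that you run.

```python
import boto3

# Minimal sketch: break down AWS Glue costs by usage type, for example
# USW2-ETL-DPU-Hour compared to USW2-ETL-Flex-DPU-Hour.
ce = boto3.client("ce")

response = ce.get_cost_and_usage(
    TimePeriod={"Start": "2024-03-01", "End": "2024-04-01"},
    Granularity="MONTHLY",
    Metrics=["UnblendedCost", "UsageQuantity"],
    Filter={"Dimensions": {"Key": "SERVICE", "Values": ["AWS Glue"]}},
    GroupBy=[{"Type": "DIMENSION", "Key": "USAGE_TYPE"}],
)

for period in response["ResultsByTime"]:
    for group in period["Groups"]:
        usage_type = group["Keys"][0]
        cost = group["Metrics"]["UnblendedCost"]["Amount"]
        print(usage_type, cost)
```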
Get the usage and cost for a specific job
To get the cost for a specific AWS Glue job, complete the following steps:
- Open the AWS Glue console.
- Under ETL Jobs, choose Job run monitoring.
- Find the DPU hours that you used for the job.
- On the AWS Glue pricing page, on the ETL jobs and interactive sessions tab, select your Region.
- Note the cost for each DPU-Hour for your job type.
- To calculate the cost, multiply your DPU hours by the cost for each DPU-Hour.
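As a quick illustration of the calculation in the last step, the following sketch multiplies example values. Replace them with the DPU hours from Job run monitoring and the DPU-Hour rate for your Region and job type from the pricing page.

```python
# Minimal sketch of the cost calculation described above. Both values are examples.
dpu_hours = 12.5           # example: DPU hours reported for the job run
price_per_dpu_hour = 0.44  # example: rate from the AWS Glue pricing page for your Region

estimated_cost = dpu_hours * price_per_dpu_hour
print(f"Estimated job cost: ${estimated_cost:.2f}")
```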
To get AWS Glue job metrics for memory or CPU usage or data traffic, set up a CloudWatch alarm.
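For example, the following sketch creates an alarm on the driver's JVM heap usage for a single job. The metric and dimension names follow the AWS Glue job metrics in the Glue CloudWatch namespace, but verify them against the metrics that your job emits (job metrics must be turned on), and replace the placeholder job name and SNS topic ARN.

```python
import boto3

# Minimal sketch: alarm when average driver JVM heap usage stays above 90%.
# The job name and SNS topic ARN are placeholders.
cloudwatch = boto3.client("cloudwatch")

cloudwatch.put_metric_alarm(
    AlarmName="glue-job-high-heap-usage",
    Namespace="Glue",
    MetricName="glue.driver.jvm.heap.usage",
    Dimensions=[
        {"Name": "JobName", "Value": "my-glue-job"},
        {"Name": "JobRunId", "Value": "ALL"},
        {"Name": "Type", "Value": "gauge"},
    ],
    Statistic="Average",
    Period=300,
    EvaluationPeriods=3,
    Threshold=0.9,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:us-west-2:111122223333:glue-alerts"],
)
```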
To get notifications about your AWS Glue job, see How do I receive Amazon SNS notifications when my AWS Glue job changes states?
Optimize cost
To optimize costs for Spark usage in AWS Glue jobs, take the following actions:
- Tune the AWS Glue job to reduce the job run duration and required number of workers.
- Define the AWS Glue job execution class as Flex for non-critical AWS Glue jobs, as shown in the first sketch after this list.
- Turn on Auto Scaling for your AWS Glue job.
- Create an AWS Glue usage profile to restrict the worker types, limit maximum worker counts, and limit a job's run duration.
- Set an appropriate timeout in the AWS Glue Jobs API so that the job doesn't run longer than the expected runtime.
- Use an appropriate worker type and worker count.
- To test and develop in a local environment, use the AWS Glue Docker image. The Docker image uses local compute resources instead of AWS Glue cloud services.
- Use the AWS Glue job bookmark feature to incrementally process data.
- Use CloudWatch metrics to identify the optimal DPU capacity, and adjust the capacity as needed.
- Stop or delete interactive sessions when you're not using them. Use the %stop_session Jupyter magic, or choose Stop session on the Interactive Sessions page of the AWS Glue console. You can also choose Stop Notebook to end the session, or stop sessions programmatically, as shown in the second sketch after this list.
Note: The session remains active, even if you close the browser tab.
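Several of the preceding actions are job settings that you can apply when you create or update a job. The following is a minimal sketch that sets the Flex execution class, a timeout, an explicit worker type and count, and job bookmarks through the CreateJob API; the job name, IAM role, script location, and numeric values are placeholders.

```python
import boto3

# Minimal sketch: create a non-critical job with the Flex execution class,
# a timeout, and an explicit worker type and count. All names and values are placeholders.
glue = boto3.client("glue")

glue.create_job(
    Name="my-flex-job",
    Role="arn:aws:iam::111122223333:role/MyGlueJobRole",
    Command={
        "Name": "glueetl",
        "ScriptLocation": "s3://my-bucket/scripts/my_job.py",
        "PythonVersion": "3",
    },
    GlueVersion="4.0",
    ExecutionClass="FLEX",      # run on spare capacity at a lower DPU-Hour rate
    Timeout=60,                 # minutes; stops runs that exceed the expected runtime
    WorkerType="G.1X",          # choose the smallest worker type that fits the workload
    NumberOfWorkers=5,
    DefaultArguments={
        "--job-bookmark-option": "job-bookmark-enable",  # process data incrementally
    },
)
```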
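If you manage interactive sessions programmatically, you can also list sessions and stop idle ones with the AWS SDK. This is a minimal sketch; "my-session-id" is a placeholder for a session ID from your account.

```python
import boto3

# Minimal sketch: review interactive sessions, then stop one that is no longer needed.
glue = boto3.client("glue")

for session in glue.list_sessions()["Sessions"]:
    print(session["Id"], session["Status"])

glue.stop_session(Id="my-session-id")
```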
Related information
Monitor and optimize cost on AWS Glue for Apache Spark