How do I troubleshoot high CPU utilization on my Amazon RDS for SQL Server instance?
I'm experiencing high CPU utilization on my Amazon Relational Database Service (Amazon RDS) for Microsoft SQL Server DB instances.
Short description
The following reasons can cause increases in CPU utilization:
- User-initiated heavy workloads, multiple concurrent queries, or long-running transactions
- Use of an under-provisioned instance class for the workload
- Out-of-date statistics and index fragmentation or missing indexes
- Queries with scope for improvement
- Blocking and deadlocks
- Parallelism
- Frequent compilation and recompilation
- Parameter sniffing
- Thread exhaustion
To identify the source of high CPU utilization for your Amazon RDS for SQL Server instance, use the following tools:
- Amazon CloudWatch metrics for Amazon RDS
- Enhanced Monitoring
- Performance Insights
- Database Insights
- SQL Server tools
After you identify the source, you can analyze and optimize your workload to reduce high CPU utilization.
Resolution
CloudWatch metrics for Amazon RDS
Use CloudWatch metrics for Amazon RDS to identify CPU patterns over extended periods.
To find the times when the workload increases CPU utilization, complete the following steps:
- Open the Amazon RDS console.
- In the navigation pane, choose Databases, and then select the database you want to monitor.
- Choose the Monitoring tab.
- Choose the Monitoring menu, and then choose CloudWatch.
- Search for the following CloudWatch metrics, and then compare the graphs:
WriteIOPs
ReadIOPs
ReadThroughput
WriteThroughput
CPUUtilization
CPUCreditBalance
CPUCreditUsage
Note: If the CPU credit balance consistently decreases and the CPU credit usage consistently increases, then there aren't enough CPU cores for the workload. If you use an instance from the t2 or t3 instance class, then check if your instance is under provisioned.
After you identify the time frame, use Enhanced Monitoring to review the data that's associated with your DB instance in more detail. You can set Enhanced Monitoring to collect data at intervals of 1, 5, 10, 15, 30, or 60 seconds.
Enhanced monitoring
You can set up Enhanced Monitoring to monitor the operating system (OS) that's running on your DB instance.
To check CPU utilization with Enhanced Monitoring, complete the following steps:
- Open the Amazon RDS console.
- In the navigation pane, choose Databases, and then select the database you want to monitor.
- Choose the Monitoring tab.
- Choose the Monitoring menu, and then choose OS process list.
Verify if the OS processes, RDS processes, SQL Server processes, or a SQL Agent processes are driving high CPU utilization. You can also check what percentage of CPU and memory these processes are utilizing.
Use the console to monitor performance metrics. On the Monitoring tab, choose the Monitoring menu, and then choose Manage graphs.
To determine when the CPU is running user processes, running kernels, or is idle, select the graphs for CPU User, CPU System, CPU Idle metrics. Then, select the metrics for Disk I/O and Physical Device I/O. These metrics include Read IO/s, Write IO/s, Read Kb/s, and Write Kb/s. You can also view memory-related parameters including Available Memory, SQL Server Total Memory, and Total Memory. These metrics are helpful because if the CPU spends more time waiting for resources, you might see high CPU utilization.
For more information, see Viewing OS metrics in the RDS console.
Database insights
Use CloudWatch Database Insights to troubleshoot the major contributors of database load and individual OS processes that are running on a group of instances.
By default, Standard mode of Database Insights is enabled for your Amazon RDS database. To turn on Standard mode when you create or modify your DB instance, see Turning on the Standard mode of Database Insights for Amazon RDS.
Note: If you aren’t using broader permissions, then be sure to grant the required IAM permissions for Database Insights. For more information, see Get started with CloudWatch Database Insights.
For more information about Amazon RDS and instance class support, see Amazon RDS DB engine, Region, and instance class support for Database Insights.
Performance insights
Turn on Performance Insights to identify queries that are responsible for the database load.
Complete the following steps:
- Access the Performance Insights dashboard.
- Check the Top SQL tab that corresponds with the time-frame you want to analyze.
- Identify the query that's taking the longest.
- Check the resource-intensive query and the wait events observed during this period. The following are wait events frequently associated with high CPU utilization:
SOS_SCHEDULER_YIELD shows that a worker thread yielded for another thread to run. When wait counts are high with low wait times, the queries are usually CPU bound. When worker threads yield, wait times in the CPU-bound queries might increase. If the wait time is long, then you must review the workload. If SOS_SCHEDULER_YIELD is prevalent, then CPU pressure is the problem. Review the type of workload and perform additional tuning.
CXPACKET and CXCONSUMER are parallelism-related wait events that aren't usually a concern. But if wait events are frequent and are affecting performance, then review the queries and set appropriate values for the cost threshold of parallelism. Make sure that SQL Server is choosing the lower cost parallelism parameter in the parameter group. You can also increase the max degree of parallelism, MAXDOP, to 1 at the query or instance level.
ThreadPool shows thread exhaustion. If your instance class can handle it, then increase the max worker threads parameter. You might experience ThreadPool waits when excessive threads are used because of blocking, a high workload, or a high number of parallel queries. Also, if you misconfigure the max worker threads parameter, then you might experience the ThreadPool wait event.
Check the Database Metrics for batch requests, SQL compilations, and SQL recompilations. Check for queries that are being compiled more than once. Also check if the queries are frequently recompiled for a given batch. If so, then this indicates that the WITH RECOMPILE clause is used in the query code. Both of these reasons can cause excessive CPU usage.
SQL Server tools
To troubleshoot high CPU utilization with SQL Server tools, take the following actions:
- Analyze specific queries for high CPU utilization, update statistics, missing indexes, and parameter sniffing. For more information, see Troubleshoot high-CPU-usage issues in SQL Server on the Microsoft website.
- Check the execution plan for poorly performing queries, and then perform additional tuning. For more information, see Display an actual execution plan on the Microsoft website.
- Use queries and Extended Events to troubleshoot excessive locking, blocking, and deadlock-related issues. For more information, see Understand and resolve SQL Server blocking problems on the Microsoft website.
Note: When you configure Extended Events in RDS for SQL Server, you can't use the standard method to save .xel files. - Use SQL Server reports to fine tune your workload. For more information, see Performance Dashboard on the Microsoft website.
- Language
- English

Relevant content
- asked 2 years ago
- Accepted Answerasked 2 years ago
- asked 10 months ago
- asked 2 years ago