There was a major version upgrade in my Amazon Relational Database Service (Amazon RDS) for MySQL or Amazon Aurora MySQL-Compatible DB instance. Now, I experience slow query performance and high CPU usage.
Resolution
Prerequisites:
To avoid misdiagnosis issues, correctly structure your troubleshooting workflow and use AWS tools and services. For more information, refer to the following articles:
Check for upgrade-specific issues
When you upgrade an instance, such as from Aurora MySQL-Compatible 5.7 to Aurora MySQL-Compatible 8.0, the upgrade might remove certain features from the instance.
If your instance uses deprecated features, then you might experience issues with the performance of your instance. To check if your instance version uses deprecated features that affect your workload, see Release notes for Amazon Aurora MySQL-Compatible Edition. Or, see MySQL 8.0 Release notes on the MySQL website.
In the following example scenario, you upgrade an Aurora MySQL-Compatible 5.7 cluster to Aurora MySQL-Compatible 8.0. After the upgrade, the writer instance CPU doubles, even though the queries, execution plans, and workload patterns are the same.
To troubleshoot this issue, you can review the metrics for your Amazon RDS for MySQL or Aurora MySQL-Compatible cluster. Based on the issue, compare the related metrics for the cluster from the previous version to the upgraded version.
Example Aurora MySQL-Compatible 5.7 metrics:
SHOW GLOBAL STATUS LIKE 'Qcache%';
Example Aurora MySQL- Compatible 8.0 metrics:
`mysql> SHOW STATUS LIKE 'Qcache%';`
`+-------------------------+--------+`
`| Variable_name | Value |`
`+-------------------------+--------+`
`| Qcache_free_blocks | 36 |`
`| Qcache_free_memory | 138488 |`
`| Qcache_hits | 79570 |`
`| Qcache_inserts | 27087 |`
`| Qcache_lowmem_prunes | 3114 |`
`| Qcache_not_cached | 22989 |`
`| Qcache_queries_in_cache | 415 |`
`| Qcache_total_blocks | 912 |`
`+-------------------------+--------+`
In this example, Aurora MySQL-Compatible 8.0 removed a feature that served queries from the cache to avoid full executions. Without this feature, Aurora MySQL fully executes all queries and doubles the workload.
If you see spikes in the instance CPU, then review the following metrics:
- Queries
- Com_select
- Innodb_rows_read
To resolve issues with these metrics, complete the following tasks:
Additional troubleshooting steps
You can compare the previous version of the execution plans to the upgraded version execution plans to better troubleshoot issues. To compare parameter group settings, use the EXPLAIN FORMAT=JSON query. Critical configurations, such as innodb_buffer_pool_size must be the same in both versions. For more information, see Configuring InnoDB Buffer Pool Size and EXPLAIN Statement on the MySQL website.
It's a best practice to use the Aurora cloning feature to test upgrades in a staging environment. You can use tools such as Sysbench to simulate workloads in the staging environment. Then, use AWS tools such as AWS Database Insights and Enhanced Monitoring to monitor key metrics.