
Understanding PostgreSQL Version Upgrades in RDS/Aurora: A Comprehensive Guide

12 minute read
Content level: Advanced

The purpose of this upgrade handbook is to provide a comprehensive, end-to-end guide for upgrading your RDS and Aurora PostgreSQL instances and clusters. It walks you through every step of the upgrade process - from checking version compatibility and requirements to performing the upgrade and completing post-upgrade tasks. This handbook serves as a consolidated resource for any customer planning an upgrade or nearing a mandatory upgrade due to version deprecation.

What are Version Upgrades

RDS/Aurora PostgreSQL supports two types of version upgrades:

a) Minor Version Upgrades introduce changes that maintain backward compatibility with existing applications. These upgrades, such as moving from version 16.8 to 16.9 or 17.4 to 17.5, typically involve patches to the binary installation of the database version. They play a crucial role in maintaining database stability and security. Each minor version release incorporates various bug fixes, addressing issues that range from minor inconveniences to critical security vulnerabilities.

Minor version upgrades can be initiated in two ways:

Manual Upgrade: You can manually trigger the upgrade by modifying your database instance.

Automatic Upgrade: By enabling the 'Auto minor version upgrade' option when creating or modifying your database, you allow Amazon RDS to manage the process. With this option, Amazon RDS automatically upgrades your database to the new preferred minor version during your specified maintenance window, after AWS has tested and approved that version.
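For example, you can enable this option with the AWS CLI (the instance identifier below is a placeholder):

```shell
# Enable automatic minor version upgrades; the actual upgrade happens
# during the instance's configured maintenance window.
aws rds modify-db-instance \
    --db-instance-identifier my-postgres-instance \
    --auto-minor-version-upgrade \
    --apply-immediately
```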

You can minimize downtime during minor version upgrades by using Amazon RDS Proxy or an open-source database proxy such as PgBouncer. When upgrading minor versions of RDS Multi-AZ DB clusters, it's possible to achieve downtime of one second or less by using the AWS Advanced JDBC Driver.

b) Major Version Upgrades can introduce substantial changes that may not be backward-compatible with existing applications. These upgrades can alter underlying data storage, modify core functionality, add or remove database parameters, and update catalog tables. Due to their significant impact, major version upgrades require manual initiation by modifying your DB instance or Multi-AZ DB cluster. This process often results in extended downtime, which can be problematic for business-critical applications. To mitigate this, consider employing alternative upgrade strategies such as blue-green deployments or logical replication. These methods can help minimize service disruption and provide more flexible upgrade paths, allowing for a smoother transition to the new database version while maintaining system availability.

The remainder of this article focuses on major version upgrades.

Why Upgrade?

Database upgrades are essential for maintaining a robust, secure, and efficient database environment. The PostgreSQL Global Development Group supports each major version for five years from its initial release. After this, a final minor version is released and the major version becomes unsupported (end-of-life). Please refer to the first release and final release dates of each major version for more details.

Running an unsupported version not only poses significant security risks but also means you'll miss out on performance improvements - for example, features like logical replication and partitioning have improved significantly with each major release. Regular upgrades ensure you maintain a secure, efficient, and well-supported database environment while taking advantage of the latest features and improvements.

Before proceeding with an upgrade, it's recommended to review the enhancements and bug fixes for RDS and Aurora.

How to Upgrade

The upgrade options below are arranged in decreasing order of downtime and increasing order of effort required:

a) In-place upgrade: This is the most straightforward option for upgrading your database. It involves simply modifying your instance or cluster to a new major version through the AWS console or API. Under the hood, it utilizes the pg_upgrade utility to perform the version upgrade. If the RDS instance is Multi-AZ, both the primary and standby are upgraded at the same time. The same applies to Aurora: all instances in the cluster are upgraded at the same time.

b) Blue-green deployment: A blue/green deployment is an upgrade strategy that minimizes risk and downtime by creating an identical copy of your production environment. The 'blue' environment represents your current production database serving live traffic, while the 'green' environment acts as a staging area that stays synchronized with production through logical replication. This parallel setup allows you to thoroughly test the upgraded version, validate application compatibility, perform performance testing, and verify data consistency in the green environment. Once testing confirms everything works as expected, you can perform a switchover from blue to green, which typically takes less than a minute, with zero data loss and no application code changes required. Please refer to the Blue/Green guide for more details.
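As a sketch, a blue/green deployment can be created and switched over with the AWS CLI (the ARN, names, versions, and deployment identifier below are placeholders):

```shell
# Create the green environment on the target engine version.
aws rds create-blue-green-deployment \
    --blue-green-deployment-name pg-major-upgrade \
    --source arn:aws:rds:us-east-1:123456789012:db:my-postgres-instance \
    --target-engine-version 16.4 \
    --target-db-parameter-group-name my-pg16-params

# After validating the green environment, perform the switchover.
aws rds switchover-blue-green-deployment \
    --blue-green-deployment-identifier bgd-EXAMPLE123 \
    --switchover-timeout 300
```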

c) Using DMS: With AWS DMS homogeneous migration, you can migrate data from your source database to an equivalent engine on AWS using native database tools. Homogeneous data migrations are serverless, which means that AWS DMS automatically provisions the resources required for your migration. With homogeneous data migrations, you can migrate tables, table partitions, data types, and secondary objects such as functions, stored procedures, triggers, indexes, and other database objects. When you create a migration project with compatible source and target data providers, AWS DMS connects to the source data provider, reads the source data, dumps the files on disk, and restores the data using native database tools. Logical replication must be enabled for CDC. Please refer to the DMS migration approach and best practices guide.

d) Using logical replication: Logical replication is a method of replicating data objects and their changes based on their replication identity (typically a primary key). This approach enables replication of specific objects or databases, offering more granular control than physical replication. It operates on a transactional basis, where changes are decoded on the publisher side using a plugin (typically pgoutput) before being sent to the subscriber. You can implement this replication through either native PostgreSQL logical replication or the 'pglogical' extension. Both methods provide powerful tools for data synchronization and upgrade scenarios, making logical replication particularly useful in upgrade strategies, as it allows for minimal downtime and flexible testing of new versions. Please refer to the native and pglogical methods respectively.
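A minimal sketch of native logical replication between the old and new clusters might look like the following (hostnames, database names, and credentials are placeholders; on RDS/Aurora, the `rds.logical_replication` parameter must first be set to 1 on the source):

```sql
-- On the source (publisher), running the old major version:
CREATE PUBLICATION upgrade_pub FOR ALL TABLES;

-- On the target (subscriber), running the new major version,
-- after the schema has been restored there:
CREATE SUBSCRIPTION upgrade_sub
    CONNECTION 'host=old-cluster.example.com dbname=appdb user=repl_user password=...'
    PUBLICATION upgrade_pub;

-- Monitor replication progress on the source before cutting over:
SELECT slot_name, confirmed_flush_lsn,
       pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn) AS lag_bytes
FROM pg_replication_slots;
```

Once `lag_bytes` stays near zero and applications are quiesced, you can repoint connections to the new cluster.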

e) Using pgactive extension: Although blue/green deployment is effective for minimizing downtime during major version upgrades, some customers opt for a rolling upgrade approach to maintain high availability. In this method, new nodes running the updated version join an existing node, are configured to accept writes, and are set up for bidirectional data replication. This enables a gradual and smooth transition to the new database version while minimizing disruption to the production environment. The pgactive extension enables active-active replication for Amazon RDS for PostgreSQL, allowing all participating nodes to handle read and write operations simultaneously. Data changes are automatically synchronized across all nodes in the cluster using PostgreSQL logical replication. pgactive includes built-in conflict detection, automatic resolution methods (such as last-write-wins), and conflict monitoring. Please refer to the pgactive approach.

Pre-Upgrade Best Practices

  1. Thoroughly review the official PostgreSQL engine release notes for your target version. These notes document important changes, new features, and potential breaking changes that might impact your applications. Release notes are available on the PostgreSQL website.

  2. When upgrading, your parameter group handling depends on your current configuration. If you're using a custom parameter group, you'll need to create a new one specifically for the target version and include any necessary parameter customizations. However, if your instances are using the default parameter group, no action is required - the system will automatically apply the target version's default parameter group after the upgrade.

  3. Prior to upgrading, verify the compatibility of all custom parameters with your target PostgreSQL version. This crucial pre-upgrade check helps prevent parameter-related issues and ensures a seamless upgrade process.
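As an illustrative sketch of this check, you can diff your customized parameters against the parameter list supported by the target version (for example, built from the output of `aws rds describe-db-parameters` for each parameter group). The helper below is hypothetical, not an AWS API; it flags customized parameters that no longer exist in the target version and would otherwise be silently dropped:

```python
def check_parameter_compatibility(current_params, target_params):
    """Compare customized parameters against the target version's parameters.

    current_params: dict of name -> value for parameters customized on the
                    current (source) parameter group.
    target_params:  set of parameter names supported by the target version.
    Returns (to_carry_over, removed): parameters to copy into the new custom
    parameter group, and parameters absent from the target version.
    """
    to_carry_over = {}
    removed = []
    for name, value in current_params.items():
        if name in target_params:
            to_carry_over[name] = value
        else:
            removed.append(name)
    return to_carry_over, sorted(removed)


# Example: 'vacuum_defer_cleanup_age' was removed in PostgreSQL 16, so a
# customized value for it cannot be carried into a PG16 parameter group.
customized = {"work_mem": "65536", "vacuum_defer_cleanup_age": "200000"}
supported_in_target = {"work_mem", "shared_buffers", "max_connections"}
carry, dropped = check_parameter_compatibility(customized, supported_in_target)
```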

  4. Monitor your instance's FreeableMemory metric in Amazon CloudWatch before initiating the upgrade, as the upgrade process can be memory-intensive. Insufficient memory is a common cause of upgrade failures. If your available memory is running low, consider scaling up to a larger instance class before proceeding with the upgrade. This proactive approach to memory management helps ensure a successful upgrade process and prevents potential disruptions.

  5. CloudWatch metrics: We recommend capturing at minimum the following CloudWatch metrics, which serve as crucial reference points for post-upgrade comparison. In addition, review the full set of RDS/Aurora metrics and capture any others that make sense for your workload.

    a. CPU Utilization

    b. FreeableMemory

    c. DatabaseConnections

    d. Read/Write IOPS

    e. Read/Write Latency

    f. Read/Write Throughput

    g. BufferCacheHitRatio (for Aurora PostgreSQL only)

    h. Commit Latency and Commit throughput (for Aurora PostgreSQL only)
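To make the post-upgrade comparison concrete, a small helper like the hypothetical sketch below can flag metrics that regressed beyond a threshold, given per-metric averages pulled from CloudWatch for matching pre- and post-upgrade windows:

```python
def compare_metrics(baseline, post_upgrade, threshold_pct=10.0):
    """Flag metrics that regressed by more than threshold_pct vs. baseline.

    baseline / post_upgrade: dict of metric name -> average value for
    comparable time windows. Higher is treated as worse (true for latency,
    CPU, connections); invert the comparison for metrics such as
    BufferCacheHitRatio where higher is better.
    """
    regressions = {}
    for name, base in baseline.items():
        post = post_upgrade.get(name)
        if post is None or base == 0:
            continue  # metric missing post-upgrade, or no usable baseline
        change_pct = (post - base) / base * 100.0
        if change_pct > threshold_pct:
            regressions[name] = round(change_pct, 1)
    return regressions


baseline = {"ReadLatency": 0.002, "CPUUtilization": 40.0}
post = {"ReadLatency": 0.003, "CPUUtilization": 41.0}
# ReadLatency rose by 50% and is flagged; CPU rose 2.5%, within threshold.
flagged = compare_metrics(baseline, post)
```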

  6. Before upgrading, capture execution plans for your critical SQL queries to establish a performance baseline. Focus on frequently executed, latency-sensitive, and business-critical queries. These baseline execution plans serve as crucial reference points for post-upgrade performance comparison and optimization. After the upgrade, compare the new execution plans with the baselines to identify any changes in query performance, detect potential regressions, and optimize queries that may have been affected by the version change.
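For example, a baseline plan can be captured with EXPLAIN (the query below is a placeholder for one of your critical queries), or logged automatically for slow statements via the auto_explain extension:

```sql
-- Capture a baseline plan for a critical query:
EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON)
SELECT o.id, o.total
FROM orders o
WHERE o.customer_id = 42
ORDER BY o.created_at DESC
LIMIT 10;

-- Alternatively, log plans of slow statements automatically by setting
-- these in the DB parameter group (values are examples):
--   shared_preload_libraries = 'auto_explain'
--   auto_explain.log_min_duration = '500ms'
```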

  7. Establish a comprehensive testing environment using a complete copy of your production data before proceeding with a major version upgrade. This step is crucial because major version upgrades are not backward-compatible and may require application modifications. You can create a test environment using two different approaches for RDS:

    Snapshot Method:

    • Create a snapshot of your production instance.
    • Choose the snapshot that you want to upgrade in the “Snapshots” section.
    • From Actions, choose Upgrade snapshot. The Upgrade snapshot page appears.
    • Choose the New engine version to upgrade to, and save changes to upgrade the snapshot.
    • Once upgraded, restore the snapshot.

    Read Replica Method:

    • Create a read replica of your production instance.
    • Promote the replica to a standalone instance once it has caught up and has no lag.
    • Upgrade this standalone instance to the target version.
    • Conduct your testing on the upgraded instance.

    For Aurora, you can either restore a snapshot and upgrade it, or simply create a clone and perform an in-place upgrade on the cloned cluster. For more information, please refer to Aurora snapshot restores and cloning.
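The snapshot method above can be sketched with the AWS CLI as follows (identifiers and versions are placeholders):

```shell
# 1. Snapshot the production instance and wait for it to complete.
aws rds create-db-snapshot \
    --db-instance-identifier my-postgres-instance \
    --db-snapshot-identifier pre-upgrade-test
aws rds wait db-snapshot-available \
    --db-snapshot-identifier pre-upgrade-test

# 2. Upgrade the snapshot itself to the target engine version.
aws rds modify-db-snapshot \
    --db-snapshot-identifier pre-upgrade-test \
    --engine-version 16.4

# 3. Restore the upgraded snapshot as a separate test instance.
aws rds restore-db-instance-from-db-snapshot \
    --db-instance-identifier pg-upgrade-test \
    --db-snapshot-identifier pre-upgrade-test
```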

  8. Choose a maintenance window that minimizes impact on business operations and ensures adequate duration to accommodate all required upgrade tasks.

  9. When utilizing upgrade methods such as blue-green deployment, AWS DMS, logical replication, or the pgactive extension (methods b, c, d, or e respectively), it's crucial to ensure there is no replication lag between the source and target databases.

  10. During a major version upgrade in RDS, the system automatically creates a snapshot before initiating the pg_upgrade process. Since Amazon EBS snapshots are incremental, taking an additional manual snapshot immediately before the upgrade can help reduce the overall outage time.

  11. It's crucial to verify that automated backups are enabled. This is a critical prerequisite because the system will only create a pre-upgrade snapshot when automated backups are active. Additionally, Point-in-Time Recovery (PITR) functionality is only available with automated backups enabled. Without these automated backups, you'll lack these important safety and recovery mechanisms during the upgrade process, potentially putting your data at risk.

  12. Reindex bloated indexes and remove duplicate or unused indexes. Please refer to the Index Maintenance Guide.

  13. Check for bloated tables and run VACUUM on them. Please refer to Diagnosing table and index bloat.

  14. PostgreSQL major version upgrades don't automatically update your database extensions. Most extensions should be upgraded after the database engine upgrade, while some require updating beforehand.

    The extensions to update before upgrade include the following:

    a. pgRouting

    b. postgis_raster

    c. postgis_tiger_geocoder

    d. postgis_topology

    e. address_standardizer

    f. address_standardizer_data_us

    Run the following command for each extension that's currently installed: ALTER EXTENSION PostgreSQL-extension UPDATE TO 'new-version';

    To learn more about upgrading the PostGIS extension, please refer to the PostGIS upgrade documentation.
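You can find extensions whose installed version lags the version shipped with the current engine using the `pg_available_extensions` catalog view, and then update each one (the extension name below is an example):

```sql
-- List installed extensions that are not at the engine's default version:
SELECT name, installed_version, default_version
FROM pg_available_extensions
WHERE installed_version IS NOT NULL
  AND installed_version IS DISTINCT FROM default_version;

-- Update an extension to the version shipped with the engine:
ALTER EXTENSION postgis UPDATE;
```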

Post-Upgrade Best Practices

  1. Run the 'ANALYZE' command immediately after the upgrade to ensure optimal query performance. This crucial step updates the database statistics used by the query planner to generate efficient execution plans. Current statistics about data distribution help the planner make informed decisions, preventing potential performance issues post-upgrade.

  2. Database instance upgrades necessitate an engine restart, which clears the shared buffers contents. This can lead to perceived performance degradation when queries are first executed on the upgraded instance, primarily due to the absence of relevant data in memory that was previously cached. To mitigate these performance issues resulting from a cold cache, you can execute a script that represents the most frequently run queries. You can also make use of the pg_prewarm extension to prewarm the shared buffers with specific tables/indexes.

    This approach efficiently loads the most relevant data into the shared buffers, ensuring subsequent queries don't suffer from cache misses. It's important to note that running 'SELECT *' queries on all database tables should be avoided. While this would warm the buffer cache, it may not prioritize the most relevant pages in memory, potentially leading to suboptimal performance.

  3. Confirm that all applications are working correctly; only then is it safe to delete the old instance or cluster (if the upgrade was performed using method b, c, d, or e).

  4. In the event of post-upgrade performance issues, refer to the pre-upgrade statistics and execution plans collected earlier for comparative analysis and troubleshooting.

  5. Analyze logs for any new slow-running queries that may have emerged post-upgrade.

  6. Verify that all parameter groups are correctly applied, especially if you created new custom parameter groups for the target version.

  7. Validate that all extensions are compatible and functioning correctly with the new version. To update an extension after a version upgrade, use the command: ALTER EXTENSION PostgreSQL-extension UPDATE TO 'new-version';

  8. Plan for any post-upgrade tasks recommended in the PostgreSQL release notes for your specific version upgrade.
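The statistics refresh and cache prewarming steps above (items 1 and 2) can be sketched as follows; the table name is a placeholder, and `vacuumdb --all --analyze-in-stages` from a client host is an alternative way to refresh statistics incrementally:

```sql
-- Refresh planner statistics for the whole database after the upgrade:
ANALYZE VERBOSE;

-- Optionally warm the buffer cache for hot relations with pg_prewarm:
CREATE EXTENSION IF NOT EXISTS pg_prewarm;
SELECT pg_prewarm('orders');
```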

AWS · EXPERT
Published 10 months ago · 3.1K views