AWS RDS Postgres Snapshot questions

  • When a manual snapshot is taken, which point in time is it consistent with: the start of the snapshot or the end?
  • Do RDS snapshots use EBS snapshots under the hood? How is consistency guaranteed? What happens to dirty blocks not yet written to disk? Does the snapshot also copy the transaction logs?
  • Is a manual offline backup possible, like the one taken before an upgrade, to get a consistent backup?
  • System backups are incremental, and manual backups are said to always be full backups. If multiple manual snapshots are taken, say every 6 hours, is each one a full backup or an incremental backup containing only the changes since the last backup?
  • I have noticed inconsistent backup times. The first manual snapshot after a DB restore takes a few minutes for a 1 TB database; I assume that is because the first system snapshot has already completed and only the incremental changes are stored in the first manual snapshot. But when the upgrade process triggers a pre-upgrade manual snapshot shortly after the first snapshot, with hardly any changes made in the DB, that second snapshot takes about 30 minutes. Why would it take longer?
2 Answers

The first snapshot of a DB instance contains the data for the full database. Subsequent snapshots of the same database are incremental, which means that only the data that has changed after your most recent snapshot is saved.

https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_WorkingWithAutomatedBackups.html

Your DB instance must be in the available state to take a DB snapshot. https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_CreateSnapshot.html

When a manual snapshot is taken in Amazon RDS, the snapshot captures the state of the database at the start of the snapshot process. However, RDS ensures that the snapshot is consistent, meaning that any changes made during the snapshot creation are not reflected in the snapshot itself. This ensures that the snapshot represents the database's state at the time the snapshot began, even though the snapshot process may take some time to complete.

The snapshot process is handled in the background, and the database remains available for reads and writes during this time, but only data from the start of the snapshot is included.

EXPERT
answered a month ago

While Oleksii provided all the necessary information, I am adding more context and detail to your questions. I also recommend reading this blog post: Amazon RDS: Snapshot, restore, and recovery demystified

1. Consistency of Manual Snapshots

When you take a manual snapshot of an Amazon RDS PostgreSQL instance, the snapshot captures the state of the database at the start of the snapshot process. This ensures that the snapshot reflects a consistent point in time, even though the process might take time to complete. The snapshot will reflect the database as it was at the moment the snapshot was initiated, including every transaction that was committed before that point.

Refer to this section for more details on snapshot creation.

2. Underlying Use of EBS Snapshots and Consistency

By default, Amazon RDS creates and saves automated backups of your DB instance securely in Amazon S3 for a user-specified retention period. In addition, you can create snapshots, which are user-initiated backups of your instance that are kept until you explicitly delete them. You can create a new instance from database snapshots whenever you desire. When a snapshot is initiated, Amazon RDS ensures that all pending writes are flushed to disk, ensuring data consistency at the start of the snapshot. This process guarantees that any "dirty blocks" (changes in memory not yet written to disk) are written before the snapshot starts.

When automated backups are turned on for your DB Instance, Amazon RDS automatically performs a full, daily snapshot of your data and captures transaction logs. When you initiate a point-in-time recovery, transaction logs are applied to the most appropriate daily backup to restore your DB instance to the specific time that you request.
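As a rough illustration of the point-in-time recovery described above, here is a minimal sketch. It assumes `rds` is a boto3 RDS client (for example `boto3.client("rds")`); the helper name `restore_to_point_in_time` and the instance identifiers are made up for the example. It only wraps the `describe_db_instances` and `restore_db_instance_to_point_in_time` calls and checks the requested time against the instance's `LatestRestorableTime`.

```python
from datetime import datetime, timezone


def restore_to_point_in_time(rds, source_id, target_id, restore_time=None):
    """Restore `source_id` into a NEW instance `target_id` (hypothetical helper).

    `rds` is assumed to be a boto3 RDS client. `restore_time` should be a
    timezone-aware datetime; if it is None, the latest restorable time is used.
    """
    instance = rds.describe_db_instances(
        DBInstanceIdentifier=source_id
    )["DBInstances"][0]
    latest = instance.get("LatestRestorableTime")

    params = {
        "SourceDBInstanceIdentifier": source_id,
        "TargetDBInstanceIdentifier": target_id,
    }
    if restore_time is None:
        # Let RDS pick the most recent point covered by the transaction logs.
        params["UseLatestRestorableTime"] = True
    else:
        # Transaction logs can only be replayed up to LatestRestorableTime.
        if latest is not None and restore_time > latest:
            raise ValueError(
                f"{restore_time} is later than the latest restorable time {latest}"
            )
        params["RestoreTime"] = restore_time

    return rds.restore_db_instance_to_point_in_time(**params)
```

For example, `restore_to_point_in_time(rds, "mydb", "mydb-restored", datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc))` would request a new instance with the data as of that time; both identifiers here are placeholders.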

More information on how this works can be found here.

3. Manual Offline Backup

Amazon RDS does not allow manual, truly offline backups where the instance is stopped entirely for backup purposes. If you try to take a manual snapshot of a DB instance in a stopped state, it fails with: "Cannot create a snapshot because the database instance <instance name> is not currently in the available state. (Service: AmazonRDS; Status Code: 400; Error Code: InvalidDBInstanceState;"

Manual snapshots themselves are still taken from a running instance, so it is more about managing database activity during the snapshot process.

Refer to this section
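To avoid running into the InvalidDBInstanceState error mentioned above, a script can check the instance status before requesting the snapshot. The following is only a sketch: it assumes `rds` is a boto3 RDS client, and the function name `create_manual_snapshot` and the identifiers are invented for illustration.

```python
def create_manual_snapshot(rds, instance_id, snapshot_id):
    """Create a manual DB snapshot only if the instance is 'available'.

    `rds` is assumed to be a boto3 RDS client, e.g. boto3.client("rds").
    Returns the DBSnapshot description from the API response.
    """
    instance = rds.describe_db_instances(
        DBInstanceIdentifier=instance_id
    )["DBInstances"][0]
    status = instance["DBInstanceStatus"]

    # Fail early with a clear message instead of a 400 from the API.
    if status != "available":
        raise RuntimeError(
            f"Instance {instance_id} is '{status}', not 'available'; "
            "snapshot not requested"
        )

    response = rds.create_db_snapshot(
        DBInstanceIdentifier=instance_id,
        DBSnapshotIdentifier=snapshot_id,
    )
    return response["DBSnapshot"]
```

The call returns as soon as the snapshot has been requested; the snapshot itself is still being created in the background at that point.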

4. Incremental vs. Full Snapshots

Although manual snapshots are always full snapshots from your perspective, RDS snapshots are incremental at the storage level. This means that each snapshot after the first one only stores the blocks that have changed since the last snapshot. Although database snapshots serve operationally as full backups, you are billed only for incremental storage use. These snapshots are designed to be consistent by coordinating with the underlying PostgreSQL database.

Refer to this section for more details.
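The "full from your perspective, incremental at the storage level" idea can be illustrated with a toy model. This is not how RDS or EBS is implemented internally; `ToyVolume` is a made-up class that only shows why a later snapshot can store very little and still restore the whole volume.

```python
class ToyVolume:
    """Toy block device with incremental snapshots (illustration only)."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)  # block number -> contents
        self.snapshots = []         # blocks physically stored by each snapshot
        self._last = {}             # volume contents as of the last snapshot

    def write(self, block_no, data):
        self.blocks[block_no] = data

    def snapshot(self):
        # Store only blocks that differ from the previous snapshot.
        changed = {
            n: d for n, d in self.blocks.items() if self._last.get(n) != d
        }
        self.snapshots.append(changed)
        self._last = dict(self.blocks)
        return len(self.snapshots) - 1

    def restore(self, index):
        # Any snapshot restores the FULL volume by layering the stored
        # blocks of all snapshots up to and including `index`.
        restored = {}
        for delta in self.snapshots[: index + 1]:
            restored.update(delta)
        return restored


vol = ToyVolume({0: "a", 1: "b", 2: "c"})
first = vol.snapshot()    # stores all 3 blocks
vol.write(1, "B")
second = vol.snapshot()   # stores only the 1 changed block
print(len(vol.snapshots[first]), len(vol.snapshots[second]))
print(vol.restore(second))
```

In this model the second snapshot stores a single block, yet restoring it yields all three blocks, which matches the billing behavior described above: each snapshot works as a full backup, while only the changed data is stored.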

5. Inconsistent Backup Times

In the case of the first manual snapshot after a database restore, the shorter time for the initial snapshot can be attributed to minimal changes since the restore.

However, the pre-upgrade manual snapshot in the second case takes longer because Amazon RDS takes two DB snapshots during the upgrade process. The first DB snapshot is of the database before any upgrade changes have been made. If the upgrade fails for your databases, you can restore this snapshot to create a database running the old version. The second DB snapshot is taken after the upgrade is completed.

Additionally, upgrades themselves can trigger additional processes or consistency checks, which could contribute to the increased duration.

For more details on upgrading PostgreSQL, check the Amazon RDS Documentation.
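If you want to compare snapshot durations yourself rather than rely on impressions, one option is to poll the snapshot status and time it. This is a sketch under assumptions: `rds` is a boto3 RDS client, `wait_for_snapshot` is a hypothetical helper, and the `sleep` and `clock` parameters exist only so the function can be exercised without real waiting.

```python
import time


def wait_for_snapshot(rds, snapshot_id, poll_seconds=30, max_wait=7200,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll a DB snapshot until it is 'available'.

    Returns the seconds elapsed since polling started (hypothetical helper).
    Raises TimeoutError if the snapshot is not available within `max_wait`.
    """
    started = clock()
    while True:
        snap = rds.describe_db_snapshots(
            DBSnapshotIdentifier=snapshot_id
        )["DBSnapshots"][0]
        if snap["Status"] == "available":
            return clock() - started

        if clock() - started >= max_wait:
            raise TimeoutError(
                f"Snapshot {snapshot_id} still '{snap['Status']}' "
                f"after {max_wait} seconds"
            )
        # PercentProgress is reported by the API while the snapshot runs.
        print(f"{snapshot_id}: {snap['Status']} "
              f"({snap.get('PercentProgress', 0)}%)")
        sleep(poll_seconds)
```

Note that the measured time starts when polling begins, not when the snapshot was requested, so it is most meaningful when called right after the snapshot is created. boto3 also ships a `db_snapshot_available` waiter if you only need to block until completion and do not care about the elapsed time.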

answered a month ago
