Questions regarding potential data loss during EBS snapshot creation

Hello,

I have two questions regarding the behavior of AWS EBS snapshots:

  1. I understand that AWS EBS snapshots are crash-consistent. In this context, does the snapshot capture and preserve the volume's state at the moment when the snapshot creation API (SDK) call returns a response? Or is it necessary to pause I/O to the volume until the snapshot reaches the completed state to ensure consistency?

  2. Related to the above: as far as I understand, after the snapshot creation request is made and a response is returned, AWS stores the block-level metadata first, and the actual block data is copied asynchronously in the background. In a scenario where the snapshot is still in a pending state, and a block (let's call it Block A) has not yet been uploaded to the S3-based snapshot storage, what happens if write I/O is issued to Block A? Will the snapshot mechanism ensure that the original data in Block A is safely copied to S3 before the write proceeds, or is there a risk that the original data could be lost from the snapshot?

I am not asking for implementation details, but would appreciate confirmation on whether there is any risk of data loss from the snapshot under the above conditions.

Thank you.

asked 8 months ago · 200 views
2 Answers

To add to Agent's reply.

If this is Windows, you can take application-consistent snapshots of attached EBS volumes using AWS Systems Manager Run Command. The snapshot process uses the Windows Volume Shadow Copy Service (VSS) to take EBS volume-level backups of VSS-aware applications. More information is available in the AWS documentation under "Application consistent Windows VSS based Amazon EBS snapshots". AWS Backup also supports Windows VSS snapshots.

AWS
EXPERT
answered 8 months ago

Let me address your questions about EBS snapshot behavior:

  1. The point-in-time snapshot is created immediately when the snapshot command is issued, but it remains in a 'pending' state until all of the data has been transferred to Amazon S3. The snapshot captures the data that had been written to the volume at the time the snapshot command was issued, so you do not need to pause I/O until the snapshot reaches the 'completed' state. However, the snapshot might exclude any data that was still cached by applications or the operating system at that moment.

For a completely consistent snapshot, it's recommended to pause file systems or applications that might have cached data not yet written to disk. If you cannot pause all file writes, consider unmounting the volume, issuing the snapshot command, and then remounting the volume to ensure a consistent and complete snapshot. Once the snapshot command has been issued, you can resume writes and continue to use your volume while the snapshot status is pending.
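The pause-then-snapshot sequence above could be sketched roughly as follows. This is a minimal illustration, not an official pattern: `ec2` is assumed to be a boto3 EC2 client (`boto3.client("ec2")`), and `freeze` and `thaw` are hypothetical callables that you would supply to pause and resume writes (for example by freezing the file system or quiescing the application). Only `create_snapshot` and the `snapshot_completed` waiter are boto3 calls.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def quiesced(freeze: Callable[[], None], thaw: Callable[[], None]) -> Iterator[None]:
    """Pause writes for the duration of the block, then always resume them."""
    freeze()
    try:
        yield
    finally:
        thaw()


def snapshot_volume(ec2, volume_id: str,
                    freeze: Callable[[], None],
                    thaw: Callable[[], None],
                    description: str = "") -> str:
    """Issue the snapshot command while writes are paused; return the snapshot ID.

    `ec2` is assumed to be a boto3 EC2 client; `freeze`/`thaw` are
    hypothetical, caller-supplied hooks.
    """
    with quiesced(freeze, thaw):
        response = ec2.create_snapshot(VolumeId=volume_id, Description=description)
    # Writes have resumed here, even though the snapshot is usually still "pending".
    return response["SnapshotId"]


def wait_until_completed(ec2, snapshot_id: str) -> None:
    """Optional: block until the snapshot reaches the "completed" state."""
    ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snapshot_id])
```

The pause only spans the `create_snapshot` call. Waiting for completion is kept separate because it matters for using the snapshot afterwards, not for its consistency.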

  2. Regarding your concern about data integrity during the pending state: when a snapshot is initiated, Amazon EBS creates the point-in-time backup immediately. Even if you continue to use the volume and modify blocks that haven't yet been transferred to S3, the original data as of the snapshot request is preserved in the snapshot.

The snapshot process works in the background and captures the state of the volume at the exact moment the snapshot was initiated. Any subsequent writes to the volume after the snapshot command is issued will not affect the data being captured in the snapshot. This means there is no risk of data loss from the snapshot under the conditions you described - if Block A is modified after the snapshot request but before it's transferred to S3, the snapshot will still contain Block A's state at the time of the snapshot request.

This asynchronous process allows you to continue using your volume without interruption while ensuring the integrity of the point-in-time backup.
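To make the point-in-time guarantee concrete, here is a toy in-memory model. It is purely illustrative and makes no claim about how EBS is implemented internally. It shows one way (copy-on-write) that a snapshot can return immediately, copy blocks in the background, and still preserve a block's original contents when that block is overwritten before it has been copied. All names here (`ToyVolume`, `background_copy_step`) are hypothetical.

```python
class ToyVolume:
    """Toy block device with point-in-time snapshots (illustration only)."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.snapshots = []

    def create_snapshot(self):
        # Returns immediately: only bookkeeping happens here, no data is copied yet.
        snap = {"pending": set(range(len(self.blocks))), "data": {}}
        self.snapshots.append(snap)
        return snap

    def write(self, index, value):
        # Before overwriting, preserve the old contents for every snapshot
        # that has not captured this block yet.
        for snap in self.snapshots:
            if index in snap["pending"]:
                snap["data"][index] = self.blocks[index]
                snap["pending"].discard(index)
        self.blocks[index] = value

    def background_copy_step(self, snap):
        # Copy one not-yet-captured block; return False once nothing is left.
        if not snap["pending"]:
            return False
        index = snap["pending"].pop()
        snap["data"][index] = self.blocks[index]
        return True


vol = ToyVolume(["a0", "b0", "c0"])
snap = vol.create_snapshot()
vol.write(0, "a1")  # "Block A" is modified while the snapshot is still pending
while vol.background_copy_step(snap):
    pass

snapshot_view = [snap["data"][i] for i in range(3)]
print(snapshot_view)  # ['a0', 'b0', 'c0']
print(vol.blocks)     # ['a1', 'b0', 'c0']
```

In this model the snapshot keeps `a0` for block 0 even though the live volume now holds `a1`, which matches the behavior described in the answer.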
Sources
CreateSnapshot - Amazon Elastic Compute Cloud
Create Amazon EBS snapshots - Amazon EBS

answered 8 months ago
AWS
EXPERT
reviewed 8 months ago
