Snapshots of encrypted EBS volumes


We know that the first snapshot of an EBS volume is a full copy and that subsequent snapshots are incremental, i.e. only the blocks that have changed since the last snapshot are copied to S3. I want to know whether this is true for encrypted volumes as well: are subsequent snapshots of an encrypted EBS volume full or incremental?

Gaurav
asked 2 years ago · 536 views
1 Answer

Yes, it's the same for encrypted volumes: the storage is incremental, but you can use any snapshot as if it were a full one. If you want more visibility, you can query the sizes of your snapshots with a script like this Python one:

#
# Works out approximate EBS snapshot sizes using the "EBS direct APIs" (available since Dec 2019)
# to get the number of blocks in a volume's oldest snapshot and the number of changed blocks
# between each pair of consecutive snapshots.
# It still doesn't take into account other optimisations AWS might apply, such as compression,
# but it's a lot better than the console, which just shows the volume size for every snapshot.
# Also note that by default the volume size is used as a quick approximation of the oldest
# snapshot's size, because that snapshot contains all data for the volume. Often, though, the
# oldest snapshot will actually be a lot smaller because EBS doesn't copy empty blocks. You can
# get a more accurate estimate by passing "--list-blocks" (or "-l"), which counts the blocks in
# each oldest snapshot, but it takes a lot longer.
#
# Version History:
# 1-Jun-2022 Steve Kinsman - Initial version.

import boto3
import argparse

parser = argparse.ArgumentParser(description='EBS Snapshot Sizes Report')
parser.add_argument('--volumeid', type=str, help='only report on snapshots of this volume')
parser.add_argument('-l', '--list-blocks', action='store_true',
                    help='count blocks to size each oldest snapshot accurately (slower)')
args = parser.parse_args()

ebs = boto3.client('ebs')  # EBS direct APIs: list_snapshot_blocks / list_changed_blocks
ec2 = boto3.client('ec2')  # used only to enumerate the snapshots themselves


def initial_snapshot_size(snapshotid):
    """
    Calculate size of an initial snapshot in GiB by listing its blocks.
    """
    num_blocks = 0
    response = ebs.list_snapshot_blocks(
        SnapshotId=snapshotid,
        MaxResults=1000
    )
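    # BlockSize is reported in bytes (for EBS snapshots normally 524288, i.e. 512 KiB blocks),
    # so convert to KiB here and to GiB in the return statement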
    blocksize_kb = response.get('BlockSize', 0) / 1024
    while True:
        num_blocks += len(response.get('Blocks', []))

        # check if there's more to retrieve    
        token = response.get('NextToken', '')
        if token == '':
            break
        response = ebs.list_snapshot_blocks(
            NextToken=token,
            SnapshotId=snapshotid,
            MaxResults=1000
        )
    return num_blocks * blocksize_kb / (1024 * 1024)

filters = []
if args.volumeid:
    # Filter snapshots by the specified volume id
    filters = [
        {
            'Name': 'volume-id',
            'Values': [
                args.volumeid
            ]
        }
    ]

# Use a paginator so we see all snapshots, not just the first page of results
snapshots = []
for page in ec2.get_paginator('describe_snapshots').paginate(OwnerIds=['self'], Filters=filters):
    snapshots.extend(page.get('Snapshots', []))

if snapshots:
    # Sort snapshots by volume ID and then by timestamp
    snapshots.sort(key=lambda snapshot: snapshot['VolumeId'] + str(snapshot['StartTime']))

    # For each volumeid go through the snapshots, reporting oldest one based on all data in the volume,
    # then subsequent ones have size calculated from the changed blocks
    v_prev = None
    sid_prev = None
    total_gb = 0
    num_volumes = 0
    for row in snapshots:
        v = row['VolumeId']
        sid = row['SnapshotId']

        # Strip off ms & timezone info
        timestamp = str(row['StartTime']).split('.')[0].split('+')[0]

        # Is this for the same volume as last time?
        # And not for the special vol-ffffffff whose snapshots won't be related to each other?
        if v == v_prev and v != 'vol-ffffffff':
            # Same volume as previous, so work out what changed
            # Same volume as the previous snapshot, so work out what changed between the two.
            # list_changed_blocks is paginated, so loop until all changed blocks are counted.
            num_changed = 0
            blocksize_kb = 0
            change_args = {'FirstSnapshotId': sid_prev, 'SecondSnapshotId': sid}
            while True:
                change_info = ebs.list_changed_blocks(**change_args)
                num_changed += len(change_info.get('ChangedBlocks', []))
                blocksize_kb = change_info.get('BlockSize', 0) / 1024
                token = change_info.get('NextToken', '')
                if token == '':
                    break
                change_args['NextToken'] = token
            gb = num_changed * blocksize_kb / (1024 * 1024)
            total_gb += gb
            print(f' - {timestamp}, {sid_prev} to {sid}: {gb:0.3f} GiB')
            # Compare the next snapshot against this one rather than against the first
            sid_prev = sid
        else:
            # new volume
            num_volumes += 1
            gb = row['VolumeSize']
            # Have we been asked to get a more accurate initial snapshot size by listing its blocks?
            if args.list_blocks:
                gb = initial_snapshot_size(sid)
            total_gb += gb
            print(f'{v}:')
            print(f" - {timestamp}, Initial Snapshot {sid}: {gb:0.3f} GiB")
            v_prev = v
            sid_prev = sid
    print(f'Total snapshot storage estimate: {total_gb:0.3f} GiB across {num_volumes} volumes')
else:
    print('No snapshots in this region owned by me.')        
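
To run it, save the script (for example as ebs_snapshot_sizes.py; the file name is just illustrative) and execute it with your AWS credentials and region configured:

python ebs_snapshot_sizes.py --volumeid vol-0123456789abcdef0 --list-blocks

The volume id above is a placeholder. Omit --volumeid to report on every snapshot you own in the region, and omit --list-blocks for a faster run that approximates each initial snapshot by its volume size.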

answered 2 years ago
