
Streamlining EC2 AMI management with S3 Tables and S3 Metadata

7 minute read
Content level: Intermediate

This article describes how to use Amazon Simple Storage Service (Amazon S3) Tables and metadata collection to efficiently manage and retrieve archived Amazon Machine Images (AMIs).

Introduction

AMIs provide the software required to configure and boot Amazon Elastic Compute Cloud (Amazon EC2) instances, and are a critical part of backup and restore strategies. To store Amazon EC2 AMIs and optimize AWS Backup costs, many organizations use Amazon S3. However, this approach makes it hard to maintain the relationship between stored AMIs and their source EC2 instances, because that connection isn't automatically preserved when you export AMIs to Amazon S3.

Because multiple customers were looking for a scalable and cost-effective method, the AWS Cloud Support engineering team identified a simpler and more efficient solution. The solution uses Amazon S3 Metadata and Amazon S3 Tables to preserve critical relationships between AMIs and EC2 instances. It also allows for efficient querying across millions of objects.

Solution overview

This solution uses S3 Tables and metadata that you can query to efficiently manage your archived AMIs stored in Amazon S3. When you export EC2 AMIs to Amazon S3, the system automatically captures and indexes essential metadata. By creating a dedicated S3 Table for your AMI metadata, you maintain crucial relationships between stored AMIs and source EC2 instances through a structured format.

Integration with Amazon Athena provides SQL-based querying capabilities and allows you to efficiently search across millions of objects. This approach significantly reduces operational overhead compared to traditional methods that require object iteration or separate mapping databases.

Implementation

Prerequisites

Step 1: Create an S3 Table bucket

  1. Open the Amazon S3 console.
  2. In the navigation pane, choose Table buckets.
  3. On the Table buckets page, choose Create table bucket.
  4. Under General configuration, provide a name for your bucket.
  5. Under Integration with AWS analytics services, select Enable integration.
  6. Choose Create table bucket.
    Note: Integration with AWS analytics services is required to access S3 Table buckets from AWS query engines such as Athena.

Step 2: Turn on S3 Metadata collection

  1. Open the Amazon S3 console.

  2. In the navigation pane, under Buckets, choose the S3 bucket that stores the AMIs.

  3. On the bucket details page, under Metadata, choose Create metadata configuration.

  4. Choose Browse S3 and select the S3 Table bucket that you created.

  5. Under Metadata table name, provide a name for the metadata table.

  6. Choose Create metadata configuration.
    Note: Initial metadata collection can take several hours to complete.

Step 3: Use Athena to interact with the S3 Metadata table

  1. Open the Amazon S3 console.

  2. In the navigation pane, under Buckets, choose the S3 bucket that stores the AMIs.

  3. On the bucket details page, under Metadata, choose Go to Athena query editor. This opens the Athena query editor dashboard. This dashboard has a separate tab that includes the corresponding S3 Metadata table to query.

  4. To get Amazon S3 object metadata information from your S3 Metadata table, run queries similar to the example metadata table queries.
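The console steps above can also be scripted. The following is a minimal sketch, not a definitive implementation, of how the parameters for the Athena `StartQueryExecution` API might be assembled in Python. The helper name `build_athena_request`, the `s3tablescatalog/<table-bucket-name>` catalog format, the `primary` workgroup, and the results location are assumptions for illustration. Verify them against your own account before you use them.

```python
from typing import Any, Dict


def build_athena_request(
    query: str,
    table_bucket_name: str,
    output_location: str,
    database: str = "aws_s3_metadata",
    workgroup: str = "primary",
) -> Dict[str, Any]:
    """Assemble keyword arguments for Athena's StartQueryExecution call.

    Assumption: the analytics integration exposes the table bucket under a
    catalog named "s3tablescatalog/<table-bucket-name>". Check the catalog
    name that the Athena query editor shows for your table bucket.
    """
    if not query.strip():
        raise ValueError("query must not be empty")
    if not output_location.startswith("s3://"):
        raise ValueError("output_location must be an s3:// URI")
    return {
        "QueryString": query,
        "QueryExecutionContext": {
            "Catalog": f"s3tablescatalog/{table_bucket_name}",
            "Database": database,
        },
        "ResultConfiguration": {"OutputLocation": output_location},
        "WorkGroup": workgroup,
    }


# With boto3 installed and credentials configured, the request could be sent as:
#   import boto3
#   params = build_athena_request(sql, "my-table-bucket", "s3://my-results/athena/")
#   boto3.client("athena").start_query_execution(**params)
```

Keeping the request builder separate from the API call makes it possible to review the catalog, database, and output location before any query runs.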

For more information on S3 Table bucket pricing, see the Amazon S3 Pricing page.

Example queries:

You can use the following example SQL query to find all AMIs that AWS Backup created for any instance. This query is useful for environments with many EC2 instances. You can also use the query to quickly identify all AMIs created in AWS Backup without specific instance IDs.

SELECT
    bucket,
    key,
    size,
    last_modified_date,
    -- Access map values directly using the key
    user_metadata['ami-name'] as ami_name,
    user_metadata['ami-registration-date'] as registration_date,
    user_metadata['ami-description'] as description,
    user_metadata['ami-owner-account'] as owner_account,
    user_metadata['ami-store-date'] as store_date
FROM
    aws_s3_metadata.test_metadata_table
WHERE
    -- Filter for AMIs that AWS Backup created
    user_metadata['ami-name'] LIKE '%AwsBackup%'
ORDER BY
    last_modified_date DESC

Note:

  • For large environments, such as environments with more than 1,000 instances, add a LIMIT clause to cap the number of rows returned. To reduce the amount of data scanned, narrow the WHERE conditions.
  • To narrow results by date range, you can add additional WHERE conditions similar to the following example: AND last_modified_date BETWEEN date1 AND date2
  • Because the query uses S3 Table indexing capabilities, it's highly scalable and more efficient than object iteration with API calls.
  • If you have specific naming conventions for your AMIs, then add additional filters as needed.
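The date range and LIMIT suggestions in the preceding notes can be combined in one statement. The following Python sketch generates such a query as a string. The function name `build_backup_ami_query` is hypothetical, the column list is shortened for brevity, and the `TIMESTAMP '...'` literal format assumes that `last_modified_date` is a timestamp column, so adjust it to match your table.

```python
import re
from datetime import datetime
from typing import Optional

# Accept only plain "database.table" style identifiers to avoid SQL injection.
_TABLE_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?$")


def build_backup_ami_query(
    table: str,
    start: Optional[datetime] = None,
    end: Optional[datetime] = None,
    limit: Optional[int] = None,
) -> str:
    """Return the AWS Backup AMI query with an optional date range and LIMIT."""
    if not _TABLE_RE.match(table):
        raise ValueError(f"unexpected table name: {table!r}")
    if (start is None) != (end is None):
        raise ValueError("start and end must be given together")
    if limit is not None and limit <= 0:
        raise ValueError("limit must be a positive integer")

    lines = [
        "SELECT",
        "    bucket,",
        "    key,",
        "    size,",
        "    last_modified_date,",
        "    user_metadata['ami-name'] AS ami_name",
        f"FROM {table}",
        "WHERE user_metadata['ami-name'] LIKE '%AwsBackup%'",
    ]
    if start is not None and end is not None:
        fmt = "%Y-%m-%d %H:%M:%S"
        lines.append(
            "  AND last_modified_date BETWEEN "
            f"TIMESTAMP '{start.strftime(fmt)}' AND TIMESTAMP '{end.strftime(fmt)}'"
        )
    lines.append("ORDER BY last_modified_date DESC")
    if limit is not None:
        lines.append(f"LIMIT {limit}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(
        build_backup_ami_query(
            "aws_s3_metadata.test_metadata_table",
            start=datetime(2025, 1, 1),
            end=datetime(2025, 1, 31, 23, 59, 59),
            limit=100,
        )
    )
```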

You can use the following SQL query to extract data from your S3 Metadata table based on the instance ID:

SELECT
    bucket,
    key,
    size,
    last_modified_date,
    -- Access map values directly using the key
    user_metadata['ami-name'] as ami_name,
    user_metadata['ami-registration-date'] as registration_date,
    user_metadata['ami-description'] as description,
    user_metadata['ami-owner-account'] as owner_account,
    user_metadata['ami-store-date'] as store_date
FROM
    aws_s3_metadata.test_metadata_table
WHERE
    -- Filter for a specific instance ID
    user_metadata['ami-name'] LIKE '%YOUR_INSTANCE_ID%'
ORDER BY
    last_modified_date DESC

Note:

  • Replace test_metadata_table with the name of your metadata table.
  • Replace YOUR_INSTANCE_ID with the instance ID, such as i-0abc123def456789.
  • For more precise matching, use a more specific pattern: LIKE '%i-0abc123def456789%'
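  • If you're tracking multiple instances, then combine several LIKE conditions with OR. The IN operator matches only exact values, so it doesn't work with wildcard patterns.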
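For several instances, one option is to generate the filter rather than write it by hand. Because the query matches the instance ID as a substring of the AMI name, the sketch below joins one LIKE condition per instance with OR. The function name `instance_filter` and the instance IDs in the test data are hypothetical. The ID pattern is deliberately lenient and is an assumption, not an official format check.

```python
import re
from typing import Iterable

# Lenient check: "i-" followed by 8 to 17 lowercase hex characters.
_INSTANCE_ID_RE = re.compile(r"^i-[0-9a-f]{8,17}$")


def instance_filter(instance_ids: Iterable[str]) -> str:
    """Build a parenthesized OR of LIKE conditions, one per instance ID."""
    ids = list(dict.fromkeys(instance_ids))  # drop duplicates, keep order
    if not ids:
        raise ValueError("at least one instance ID is required")
    for instance_id in ids:
        # Validation also prevents quotes or wildcards from reaching the SQL.
        if not _INSTANCE_ID_RE.match(instance_id):
            raise ValueError(f"unexpected instance ID: {instance_id!r}")
    clauses = [f"user_metadata['ami-name'] LIKE '%{i}%'" for i in ids]
    return "(" + "\n    OR ".join(clauses) + ")"


if __name__ == "__main__":
    print("WHERE")
    print("    " + instance_filter(["i-0abc123def456789", "i-0123456789abcdef0"]))
```

The returned text can replace the single LIKE condition in the WHERE clause of the query above.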

This approach is more efficient than making multiple DescribeImages API calls, especially when dealing with hundreds of instances.
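After the query returns, the rows can be grouped by source instance in a single pass. The following sketch assumes that each row is a dictionary with `key` and `ami_name` fields, and that the instance ID appears somewhere in the AMI name. The function name `group_by_instance` and the sample names in the test data are hypothetical. Your naming convention may differ.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping, Optional

# "i-" plus 8 to 17 hex characters, not preceded by a letter or digit so that
# an AMI ID such as "ami-0123..." is not mistaken for an instance ID.
_INSTANCE_ID = re.compile(r"(?<![A-Za-z0-9])i-[0-9a-f]{8,17}(?![0-9a-f])")


def group_by_instance(
    rows: Iterable[Mapping[str, Optional[str]]],
) -> Dict[Optional[str], List[str]]:
    """Map each instance ID found in ami_name to the matching object keys.

    Rows without a recognizable instance ID are collected under None.
    """
    grouped: Dict[Optional[str], List[str]] = defaultdict(list)
    for row in rows:
        match = _INSTANCE_ID.search(row.get("ami_name") or "")
        instance_id = match.group(0) if match else None
        grouped[instance_id].append(row["key"])
    return dict(grouped)
```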

Note: Athena queries are priced per TB of data scanned. For current pricing details, visit the Athena Pricing page.
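Because pricing is based on the data scanned, a rough cost estimate is simple arithmetic. The sketch below takes the price per TB as an input, because the rate varies and this article doesn't state it. The 5.0 USD figure in the example is a placeholder, not a quoted price, and the decimal definition of a terabyte is also an assumption. The scanned byte count can typically be read from the query execution statistics that Athena reports after a query finishes.

```python
BYTES_PER_TB = 10**12  # assumption: decimal terabyte; confirm on the pricing page


def estimate_scan_cost(bytes_scanned: int, usd_per_tb: float) -> float:
    """Estimate the query cost in USD from the bytes scanned.

    This ignores any per-query minimums or rounding rules that may apply.
    """
    if bytes_scanned < 0 or usd_per_tb < 0:
        raise ValueError("inputs must be non-negative")
    return bytes_scanned / BYTES_PER_TB * usd_per_tb


if __name__ == "__main__":
    # 250 GB scanned at a placeholder rate of 5.0 USD per TB.
    print(f"{estimate_scan_cost(250 * 10**9, 5.0):.2f} USD")
```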

Cleanup

To stop collecting S3 object metadata, delete the metadata table configuration. To delete the metadata table from your S3 Table bucket, you can use the AWS CLI or SDK.

When you implement Amazon S3 Metadata, follow best practices across system-defined and user-defined object metadata. For S3 Tables, create dedicated buckets with analytics integration turned on and optimize query performance through proper partitioning. It's a best practice to implement robust security through IAM policies and encryption, while monitoring costs and performance. You can also consider Regional availability, backup strategies, and integration requirements for your implementation.

Conclusion

When you use S3 Tables and S3 Metadata that you can query, you can efficiently manage your archived EC2 AMIs while maintaining the relationship with source instances. This solution reduces operational overhead and provides cost-effective metadata management through simplified querying. Also, this solution addresses common challenges in AMI archival strategies. Instead of managing complex tag propagation during export or maintaining separate metadata tracking systems, S3 Tables automatically preserves the relationship context of your AMIs. The metadata collection process runs automatically in the background and makes sure that your AMI catalog stays current without manual intervention.

AWS Support engineers and Technical Account Managers (TAMs) can help you with general guidance, best practices, troubleshooting, and operational support on AWS. To learn more about our plans and offerings, see AWS Support.

About the authors

Jatin Makani

Jatin Makani is a Senior Technical Account Manager (TAM) supporting Energy customers in AWS Enterprise Support. Jatin is passionate about Cloud Governance, DevOps, Infrastructure as Code, modernization, and solving complex customer issues.


Karthik Kukkapalli

Karthik Kukkapalli is a Senior Technical Account Manager (TAM) in AWS Enterprise Support. He provides support and guidance to energy sector customers, helping them optimize their AWS usage. As an Amazon SageMaker expert, Karthik assists customers on their AI/ML journey with AWS. His technical expertise and dedication ensure that customers can fully leverage AWS's capabilities, driving their operational success. Outside of work, Karthik enjoys spending time with his family and friends.