How can I initiate restores for a large volume of Amazon S3 objects that are currently in the S3 Glacier or S3 Glacier Deep Archive storage class?

I have a large number of objects in the Amazon Simple Storage Service (Amazon S3) Glacier or Amazon S3 Glacier Deep Archive storage class. I want to initiate a restore on all of these objects in a large-scale operation.

Resolution

To restore a large volume of Amazon S3 Glacier storage class objects, you can use either of the following options:

  • Amazon S3 Batch Operations
  • A custom script that you created using AWS Command Line Interface (AWS CLI)

Note: If you receive errors when running AWS CLI commands, make sure that you're using the most recent version of the AWS CLI.

Use an S3 batch operation

Create an Amazon S3 Batch Operations job to initiate the restore for all the objects. You can run an S3 Initiate Restore Object job on a custom list of objects or an Amazon S3 inventory report.

Note: Before you create a job, be sure to review the pricing for Amazon S3 Batch Operations.

Before you begin to create an S3 Batch Operations job, be sure that the following requirements are met:

  • You have an AWS Identity and Access Management (IAM) role that has permissions to initiate a restore. The role must also have a trust policy that allows the S3 Batch Operations service principal (batchoperations.s3.amazonaws.com) to assume it.
  • You have a CSV list or an Amazon S3 inventory report to serve as the manifest of the objects that you want to restore. The manifest file must be stored in an S3 bucket. Manifests with server-side encryption (customer-provided keys or AWS Key Management Service keys) aren't supported. For more information on the requirements for each format, see Specifying a manifest.
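
As an illustration of the CSV manifest requirement, the following sketch turns a plain list of object keys into manifest rows of the form bucket,key. It assumes one key per line with no embedded newlines, and it URL-encodes each key, which CSV manifests expect. The helper names urlencode and make_manifest and the file name keys.txt are hypothetical, and the sketch has only been reasoned through for ASCII keys.

```shell
# Sketch only: urlencode and make_manifest are hypothetical helper names.

# Percent-encode every byte except unreserved characters and "/".
# The subshell body "( ... )" keeps LC_ALL and the variables local.
urlencode() (
  LC_ALL=C
  s="$1"
  out=""
  while [ -n "$s" ]; do
    rest="${s#?}"          # everything after the first character
    c="${s%"$rest"}"       # the first character itself
    case "$c" in
      [a-zA-Z0-9._~/-]) out="$out$c" ;;
      *) out="$out$(printf '%%%02X' "'$c")" ;;
    esac
    s="$rest"
  done
  printf '%s\n' "$out"
)

# Read one object key per line on stdin and print "bucket,encoded-key" rows.
make_manifest() (
  bucket="$1"
  while IFS= read -r key; do
    [ -n "$key" ] || continue     # skip blank lines
    printf '%s,%s\n' "$bucket" "$(urlencode "$key")"
  done
)

# Example usage (keys.txt is an assumed local file of object keys):
#   make_manifest awsexamplebucket < keys.txt > manifest.csv
#   aws s3 cp manifest.csv s3://awsexamplebucket/manifest.csv
```

Generating the manifest locally lets you review the exact set of objects before the job is created.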

To use the Amazon S3 console to create an S3 Batch Operations job that initiates a restore, do the following:

1.    Open the Amazon S3 console.

2.    From the navigation pane, choose Batch operations.

3.    Choose Create job.

4.    For Region, select the AWS Region where you want to create the job.

5.    Under Choose manifest, enter the following:
For Manifest format, select S3 inventory report or CSV as your file format.
For Path to manifest object, enter the S3 path to the manifest file (Example: s3://awsexamplebucket/manifest.csv).

6.    Choose Next.

7.    Under Choose operation, enter the following:
For Operation, select Restore.
For Restore source, select Glacier or Glacier Deep Archive.
For Number of days that the restored copy is available, enter the number of days for your use case.
For Restore tier, select either Bulk retrieval or Standard retrieval. For more information on each tier, see Archive retrieval options.

Note: S3 Batch Operations doesn't support the Expedited retrieval tier.

8.    Choose Next.

9.    Under Configure additional options, enter the following:
For Description, optionally enter a description of the job.
For Priority, enter a number to indicate the job's priority.
For Generate completion report, keep this option selected.
For Completion report scope, select Failed tasks only or All tasks depending on your use case.
For Path to completion report destination, enter the path that you want the report to be sent to.
For Permission, select Choose from existing IAM roles. Then, select the IAM role that has permissions to initiate a restore and a trust policy for S3 Batch Operations.

10.    Choose Next.

11.    On the Review page, review the details of the job. Then, choose Create job.

12.    After you create the job, the job's status changes from New to Preparing. Then, the status changes to Awaiting your confirmation. To run the job, you must select the job and then choose Confirm and run. The job doesn't run until you confirm it.

13.    (Optional) If you selected Generate completion report, then review the report after the job completes. You can find the report at the Path to completion report destination that you specified.

For descriptions of each job status, see Job statuses.

For more information on failed jobs, see Tracking job failure.
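
The console steps above can also be approximated from the AWS CLI with the s3control commands. The following is a minimal sketch, not a definitive implementation: the function names are hypothetical, the values passed in (account ID, role ARN, manifest ARN and ETag, report bucket ARN) are placeholders that you supply, and the report prefix restore-reports, the priority, and the five-day Bulk restore are assumed settings. It also assumes that a default Region is configured for the AWS CLI.

```shell
# Sketch only: function names and all argument values are placeholders.

# Create a restore job from a CSV manifest; prints the new job ID.
create_restore_job() (
  account_id="$1"         # 12-digit AWS account ID
  role_arn="$2"           # IAM role that S3 Batch Operations assumes
  manifest_arn="$3"       # for example, arn:aws:s3:::awsexamplebucket/manifest.csv
  manifest_etag="$4"      # ETag of the manifest object
  report_bucket_arn="$5"  # for example, arn:aws:s3:::awsexamplebucket

  aws s3control create-job \
    --account-id "$account_id" \
    --role-arn "$role_arn" \
    --priority 10 \
    --confirmation-required \
    --description "Restore archived objects" \
    --operation '{"S3InitiateRestoreObject":{"ExpirationInDays":5,"GlacierJobTier":"BULK"}}' \
    --manifest "{\"Spec\":{\"Format\":\"S3BatchOperations_CSV_20180820\",\"Fields\":[\"Bucket\",\"Key\"]},\"Location\":{\"ObjectArn\":\"$manifest_arn\",\"ETag\":\"$manifest_etag\"}}" \
    --report "{\"Bucket\":\"$report_bucket_arn\",\"Prefix\":\"restore-reports\",\"Format\":\"Report_CSV_20180820\",\"Enabled\":true,\"ReportScope\":\"FailedTasksOnly\"}" \
    --query JobId --output text
)

# Equivalent of choosing Confirm and run in the console.
confirm_job() {
  aws s3control update-job-status --account-id "$1" --job-id "$2" \
    --requested-job-status Ready
}

# Print the current job status.
job_status() {
  aws s3control describe-job --account-id "$1" --job-id "$2" \
    --query 'Job.Status' --output text
}
```

Because the job is created with --confirmation-required, it doesn't run until confirm_job (or the console) confirms it, which mirrors the console flow.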

Use a custom AWS CLI script

You can restore Amazon S3 Glacier objects with the AWS CLI restore-object command. However, the command restores only one object per call and doesn't support a bulk restore action. To restore objects in bulk from the S3 Glacier storage classes, use the following custom commands with the retrieval tier that fits your use case.

Note: Test these custom scripts in a test or development environment before you use them in your production environment. The custom commands restore the objects in the S3 Glacier storage classes one at a time. If you have a large number of objects, the command might time out. To reduce the number of objects per run, scope the command with the --prefix parameter.

For a Linux or Unix-based system, run the following command to recursively restore all the S3 Glacier objects in the bucket:

aws s3api list-objects --bucket <bucket-name> --prefix <prefix> --query 'Contents[?StorageClass==`GLACIER`][Key]' --output text | xargs -I {} sh -c "aws s3api restore-object --bucket <bucket-name> --key \"{}\" --restore-request Days=5,GlacierJobParameters={Tier=Standard} || true"

Be sure to do the following:

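  • Replace <bucket-name> with the S3 bucket name.
  • Replace <prefix> with the key prefix (folder path) of the objects.
  • To restore S3 Glacier Deep Archive objects, replace GLACIER with DEEP_ARCHIVE in the --query filter. The filter matches only one storage class at a time.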
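
If object keys contain quotes, dollar signs, or other characters that a shell interprets, passing them through sh -c can be fragile. The following sketch is one alternative: it reads one key per line and passes each key to restore-object as a single argument. The function name restore_class is hypothetical, the defaults of 5 days and the Standard tier are taken from the command above, and keys are assumed to contain no newline characters.

```shell
# Sketch only: restore_class is a hypothetical helper name.
# Usage: restore_class <bucket> <prefix> <GLACIER|DEEP_ARCHIVE> [days] [tier]
restore_class() (
  bucket="$1"
  prefix="$2"
  class="$3"
  days="${4:-5}"
  tier="${5:-Standard}"

  aws s3api list-objects-v2 --bucket "$bucket" --prefix "$prefix" \
      --query "Contents[?StorageClass=='$class'].[Key]" --output text |
  while IFS= read -r key; do
    # An empty listing prints "None" in text output; skip it and blank lines.
    # Caveat: an object whose key is literally "None" would also be skipped.
    [ -n "$key" ] && [ "$key" != "None" ] || continue
    # Report a failed request (for example, a restore already in progress)
    # and continue with the next key.
    aws s3api restore-object --bucket "$bucket" --key "$key" \
        --restore-request "Days=$days,GlacierJobParameters={Tier=$tier}" \
        < /dev/null ||
      echo "restore request failed for: $key" >&2
  done
)

# Example usage:
#   restore_class awsexamplebucket logs/ GLACIER
#   restore_class awsexamplebucket logs/ DEEP_ARCHIVE 5 Bulk
```

Running the function once per storage class covers both S3 Glacier and S3 Glacier Deep Archive objects. Deep Archive accepts only the Standard and Bulk tiers.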

For a Windows-based system, do the following in Command Prompt:

1.    Run the following command to list all the S3 Glacier objects in the bucket:

aws s3api list-objects --bucket <bucket-name> --prefix <prefix> --query "Contents[?StorageClass==`GLACIER`][Key]" --output text > list.txt

When you run this command, the list of objects is saved in a file named list.txt.

2.    Run the following command to restore the S3 Glacier objects:

for /F "tokens=*" %i in (list.txt) do @aws s3api restore-object --bucket <bucket-name> --key "%i" --restore-request Days=5,GlacierJobParameters={Tier=Standard}

Note: The preceding custom AWS CLI commands incur additional charges for the LIST requests and the data retrieval requests. Because listing objects is a paginated operation, the AWS CLI might issue multiple API calls to retrieve the complete result set. For more information, see Amazon S3 pricing.
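
To get a rough sense of how many LIST requests the pagination described in the preceding note produces, the following sketch does the arithmetic. It assumes that each LIST response returns at most 1,000 keys, which is the Amazon S3 maximum. The function name list_request_count is hypothetical. Multiply the result by the current LIST request price from the pricing page to estimate the cost.

```shell
# Sketch only: list_request_count is a hypothetical helper name.
# Usage: list_request_count <number-of-objects> [keys-per-page]
list_request_count() (
  objects="$1"
  page_size="${2:-1000}"
  # Even an empty listing costs one request.
  if [ "$objects" -le 0 ]; then
    echo 1
    exit 0
  fi
  # Ceiling division: a partially filled last page still costs one request.
  echo $(( (objects + page_size - 1) / page_size ))
)

# Example: 2,500 objects under the prefix need 3 LIST requests.
#   list_request_count 2500
```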

Related information

Creating an S3 Batch Operations job

Performing large-scale batch operations on Amazon S3 objects

Managing S3 Batch Operations jobs

AWS OFFICIAL
Updated a year ago
3 Comments

Simple as that, y'all!

replied 2 years ago

If you need to be more selective, I'd recommend filtering the object keys into a file in advance and then using xargs:

xargs -a object-keys.lst -rn1 \
  aws s3api restore-object \
  --restore-request '{"Days":5,"GlacierJobParameters":{"Tier":"Bulk"}}' \
  --bucket BUCKET --key
Keiran
replied 5 months ago

Thank you for your comment. We'll review and update the Knowledge Center article as needed.

AWS
MODERATOR
replied 5 months ago