
AWS Backup: Unexpectedly High S3 API Costs (ReadObjectTagging) from Tag-Based AWS Backup Plan


Hello AWS Community,

We have recently configured an AWS Backup plan with the goal of protecting our S3 buckets. The resource assignment was set up to automatically include any S3 bucket with a specific tag (e.g., backup-policy: daily).

Shortly after activating this tag-based assignment, we observed a massive and unexpected spike in our S3 API costs, increasing our daily spend significantly. Cost Explorer analysis shows this increase is almost entirely from ReadObjectTagging and ReadACL operations, running into many millions of requests per day.
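For readers who want to reproduce this kind of breakdown programmatically, here is a minimal sketch that builds the request parameters for Cost Explorer's `get_cost_and_usage` call, grouped by S3 operation. The function name, the 14-day window, and the choice of metrics are assumptions made for illustration; the service name string should be checked against what your own Cost Explorer shows.

```python
from datetime import date, timedelta
from typing import Optional


def s3_operation_cost_query(days: int = 14, today: Optional[date] = None) -> dict:
    """Build keyword arguments for a daily S3 cost breakdown by operation.

    Hypothetical helper: it only assembles the request and does not call AWS.
    """
    end = today or date.today()
    start = end - timedelta(days=days)
    return {
        # Cost Explorer treats End as exclusive.
        "TimePeriod": {"Start": start.isoformat(), "End": end.isoformat()},
        "Granularity": "DAILY",
        "Metrics": ["UnblendedCost", "UsageQuantity"],
        # Restrict the report to S3 (assumed service name string).
        "Filter": {
            "Dimensions": {
                "Key": "SERVICE",
                "Values": ["Amazon Simple Storage Service"],
            }
        },
        # One series per operation, e.g. ReadObjectTagging or ReadACL.
        "GroupBy": [{"Type": "DIMENSION", "Key": "OPERATION"}],
    }


# Usage (requires boto3 and credentials with Cost Explorer access):
#   import boto3
#   response = boto3.client("ce").get_cost_and_usage(**s3_operation_cost_query())
```

Grouping by operation rather than usage type keeps the request names visible in the result, which makes it easier to compare days before and after the backup plan was activated.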

I have a screenshot of the cost increase from Cost Explorer to illustrate the impact. We've done some investigation, and I want to clarify a few key points about our setup:

  • The backup plan is configured for periodic snapshots (daily), not a 'Continuous Backup' plan for Point-in-Time-Recovery (PITR).
  • The tags are applied at the S3 bucket level, not on individual objects within the buckets.

Our assumption was that by tagging the bucket, AWS Backup would simply identify the bucket and initiate the backup, without needing to inspect the tags of the millions of individual objects inside.

During our research, we found this re:Post article: https://repost.aws/questions/QUW1CrY9QRR9yqyVPoK6NmAQ/s3-what-needs-to-readacl-and-readobjecttagging

In that thread, the user suggests that changing the backup type to continuous fixed the issue for them. However, this doesn't seem like the right solution for our use case, as we do not require Point-in-Time-Recovery and are concerned that enabling continuous backup would simply trade the high API costs for even higher backup storage costs.

This leads me to the following questions:

  1. Is this high volume of ReadObjectTagging API calls the expected behavior when using tag-based resource selection for S3 in AWS Backup, even when only tagging the buckets?

  2. Could you help me understand the underlying mechanism? Why does the service seem to perform object-level operations when the selection criteria are defined at the bucket level?

  3. What is the AWS-recommended best practice to cost-effectively manage S3 backups at scale without incurring these high API costs? Is switching the resource assignment method from dynamic (tags) to static (specifying bucket ARNs directly) the most effective solution?

Thank you.

Article reference: https://repost.aws/knowledge-center/backup-organization-backups https://repost.aws/questions/QUW1CrY9QRR9yqyVPoK6NmAQ/s3-what-needs-to-readacl-and-readobjecttagging

asked 3 months ago, 78 views
2 Answers

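Yes, the initial spike in API costs is expected.

This is because AWS Backup backs up object metadata along with the data. It supports the following metadata: tags, access control lists (ACLs), user-defined metadata, original creation date, and version ID. Capturing tags and ACLs means reading them for each object, which is the likely source of the ReadObjectTagging and ReadACL requests, regardless of where the selection tag is applied. You can restore all backed-up data and metadata except original creation date, version ID, storage class, and e-tags.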
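To get a feel for the scale, the per-object reads can be turned into a rough cost estimate. The sketch below is back-of-the-envelope arithmetic only. It assumes one tagging read and one ACL read per object on every backup (a worst case, not confirmed behavior), and the default request price is a placeholder to be replaced with the rate for your Region from the S3 pricing page. The function and parameter names are made up for illustration.

```python
def estimate_metadata_read_cost(
    object_count: int,
    requests_per_object: int = 2,             # assumed: 1 tagging read + 1 ACL read
    price_per_1000_requests: float = 0.0004,  # assumed placeholder rate in USD
    backups_per_month: int = 30,              # daily snapshot schedule
) -> dict:
    """Estimate request volume and cost if every object is read on each backup."""
    requests_per_backup = object_count * requests_per_object
    cost_per_backup = requests_per_backup / 1000 * price_per_1000_requests
    return {
        "requests_per_backup": requests_per_backup,
        "cost_per_backup": cost_per_backup,
        "cost_per_month": cost_per_backup * backups_per_month,
    }


# Example: a bucket holding 100 million objects.
estimate = estimate_metadata_read_cost(100_000_000)
print(estimate["requests_per_backup"])       # 200000000
print(round(estimate["cost_per_backup"], 2))
print(round(estimate["cost_per_month"], 2))
```

Under these assumptions the cost scales linearly with object count, which is why buckets with many small objects are affected far more than buckets with a few large ones.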

For buckets with more than 300 million objects:

  • Continuous backups are recommended.
  • If backup lifecycle is planned for more than 35 days, you can also enable snapshot backups for the bucket in the same vault in which your continuous backups are stored.

Please refer to Best practices and cost considerations for S3 backups: https://docs.aws.amazon.com/aws-backup/latest/devguide/s3-backups.html#bestpractices-costoptimization

To avoid recurring API costs for larger S3 buckets, continuous backups are recommended.

Also, note that using features of AWS KMS, CloudTrail, Amazon CloudWatch, and Amazon GuardDuty as part of your backup strategy can result in additional costs beyond S3 bucket data storage. If they are enabled, disable CloudTrail data events and exclude AWS KMS events.

AWS
answered 3 months ago
  • Thank you @ARK for sharing this information. It is really useful. For smaller buckets, though, I would like to know: if we switch from tag-based resource selection to static resource assignment for S3 backups, will AWS Backup still incur the same level of ReadObjectTagging and GetObjectAcl API calls during object enumeration? And are there any optimization techniques, beyond using Continuous Backups and lifecycle rules, that can reduce these per-object metadata scan costs for very large buckets?


Tag-based resource selection and static resource assignment should behave the same way for S3 backups: in both cases AWS Backup scans the S3 objects and takes the backups via API calls.
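To make the comparison concrete, the sketch below builds the two kinds of `BackupSelection` structure accepted by the AWS Backup `create_backup_selection` API. Only the way buckets are chosen differs: `ListOfTags` for dynamic selection, `Resources` for static ARNs. Nothing in either structure controls how objects inside the bucket are read. The helper names, selection name, and role ARN are hypothetical.

```python
from typing import List


def tag_based_selection(name: str, role_arn: str, tag_key: str, tag_value: str) -> dict:
    """Dynamic assignment: any supported resource carrying the tag is included."""
    return {
        "SelectionName": name,
        "IamRoleArn": role_arn,
        "ListOfTags": [
            {
                "ConditionType": "STRINGEQUALS",
                "ConditionKey": tag_key,
                "ConditionValue": tag_value,
            }
        ],
    }


def static_selection(name: str, role_arn: str, bucket_names: List[str]) -> dict:
    """Static assignment: buckets are listed explicitly by ARN."""
    return {
        "SelectionName": name,
        "IamRoleArn": role_arn,
        "Resources": [f"arn:aws:s3:::{bucket}" for bucket in bucket_names],
    }


# Hypothetical role ARN, for illustration only.
ROLE = "arn:aws:iam::123456789012:role/ExampleBackupRole"
dynamic = tag_based_selection("daily-s3", ROLE, "backup-policy", "daily")
static = static_selection("daily-s3", ROLE, ["example-bucket-a", "example-bucket-b"])

# Usage (requires boto3, credentials, and an existing backup plan ID):
#   import boto3
#   boto3.client("backup").create_backup_selection(
#       BackupPlanId=plan_id, BackupSelection=static)
```

Static assignment still gives tighter control over which buckets are backed up at all, so it can help exclude large buckets that were tagged unintentionally, even if the per-object cost for included buckets stays the same.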

AWS
answered 3 months ago
