S3 Lifecycle is a good mechanism for automatically deleting objects based on certain criteria. You can filter by object size, object key prefix, one or more object tags, or a combination of these. So one way would be to tag your objects when they are created or uploaded and apply a lifecycle policy that targets those tags. https://docs.aws.amazon.com/AmazonS3/latest/userguide/intro-lifecycle-rules.html
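As a rough illustration of the tag-based approach, here is a minimal sketch that prints a lifecycle configuration expiring tagged objects after a given number of days. The function name `lifecycle_json`, the rule ID, and the tag key and value in the usage comment are hypothetical placeholders, not anything from your setup. Note that lifecycle filters match on prefix, tags, and object size, not on a key suffix such as .json, which is why tagging at upload time is needed here.

```shell
#!/bin/bash
# Print a lifecycle configuration that expires objects carrying one tag.
# Arguments: tag key, tag value, age in days (all placeholder values).
lifecycle_json() {
  local tag_key=$1 tag_value=$2 days=$3
  cat <<EOF
{
  "Rules": [
    {
      "ID": "expire-tagged-objects",
      "Status": "Enabled",
      "Filter": { "Tag": { "Key": "$tag_key", "Value": "$tag_value" } },
      "Expiration": { "Days": $days }
    }
  ]
}
EOF
}

# Usage (not run here; the tag name is an assumption):
#   lifecycle_json purge-after-90d true 90 > lifecycle.json
#   aws s3api put-bucket-lifecycle-configuration --bucket "$bucket" \
#     --lifecycle-configuration file://lifecycle.json
```

Be aware that putting a lifecycle configuration replaces the bucket's existing one, so any current rules would need to be included in the same file.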
Another way would be to use an S3 Inventory report: identify the objects that you want to delete, then use the SDKs or APIs described below to submit multi-object delete requests. https://docs.aws.amazon.com/AmazonS3/latest/userguide/delete-multiple-objects.html https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteObjects.html
You can also use Athena queries to identify the objects from the S3 Inventory report. https://docs.aws.amazon.com/AmazonS3/latest/userguide/storage-inventory-athena-query.html
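To make the multi-object delete concrete, here is a minimal sketch that turns a list of keys (one per line) into DeleteObjects request bodies and submits them in batches of 1000, which is the per-request limit of that API. The function names `build_delete_payload` and `delete_in_batches` and the key file are hypothetical; the JSON escaping only covers backslashes and double quotes, so keys with control characters would need a proper JSON encoder.

```shell
#!/bin/bash
# Build a DeleteObjects request body from keys read on stdin, one per line.
build_delete_payload() {
  local key sep='' out='{"Objects":['
  while IFS= read -r key; do
    key=${key//\\/\\\\}   # escape backslashes
    key=${key//\"/\\\"}   # escape double quotes
    out+="${sep}{\"Key\":\"${key}\"}"
    sep=','
  done
  printf '%s\n' "${out}],\"Quiet\":true}"
}

# Delete every key listed in a file, at most 1000 keys per request.
delete_in_batches() {
  local bucket=$1 keyfile=$2 tmpdir chunk
  [ -s "$keyfile" ] || return 0          # nothing to delete
  tmpdir=$(mktemp -d)
  split -l 1000 "$keyfile" "$tmpdir/chunk."
  for chunk in "$tmpdir"/chunk.*; do
    build_delete_payload < "$chunk" > "$chunk.json"
    aws s3api delete-objects --bucket "$bucket" --delete "file://$chunk.json"
  done
  rm -rf "$tmpdir"
}

# Usage (not run here): delete_in_batches "$bucket" keys_to_delete.txt
```

Batching like this issues one request per 1000 objects instead of one per object, which matters for buckets with very large object counts.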
I built this and it is working for my requirement:
#!/bin/bash
# Cutoff date (13 weeks and 1 day ago): GNU date first, BSD/macOS date as fallback
SINCE_today=$(date --date '-13 weeks -1 days' +%F 2>/dev/null || date -v-13w -v-1d +%F)
bucket=POC_DEV_Bucket
# List keys last modified before the cutoff, keep only .json keys, show them, then delete each one
aws s3api list-objects-v2 --bucket "$bucket" --query "Contents[?LastModified<'$SINCE_today'].[Key]" --output text > 90days_old.txt \
  && grep -i '\.json$' 90days_old.txt > s3file_with_extension_with_daysold.txt \
  && cat s3file_with_extension_with_daysold.txt \
  && while IFS= read -r key; do aws s3 rm "s3://$bucket/$key"; done < s3file_with_extension_with_daysold.txt
I have lifecycle rules enabled, but some buckets hold terabytes of data, and there I need to delete only objects with particular extensions (such as .json, .csv, and .html) that are more than 90 days old.
So I am wondering whether there is a Lambda function or some other automation that already does this before I start working on new code.
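For the several-extensions case, a small filter could sit between the listing step and the delete step. This is only a sketch: the function name `filter_by_extension` and the variable and file names in the usage comment are hypothetical, and it assumes keys arrive one per line on stdin.

```shell
#!/bin/bash
# Keep only keys (one per line on stdin) whose name ends in one of the
# given extensions, matched case-insensitively and anchored at the end.
filter_by_extension() {
  local IFS='|'                   # "$*" joins the arguments with |
  grep -iE "\.($*)$" || true      # no match is not treated as an error
}

# Usage (not run here; $bucket and $cutoff are assumed to be set):
#   aws s3api list-objects-v2 --bucket "$bucket" \
#     --query "Contents[?LastModified<'$cutoff'].[Key]" --output text \
#     | filter_by_extension json csv html > keys_to_delete.txt
```

Anchoring the pattern at the end of the key avoids matching names such as report.json.bak, which a plain grep for ".json" would also pick up.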