Advice to reduce the costs of reading large numbers of files in S3

I recently acquired a new client that stores hundreds of thousands of files per month on Amazon S3. The volume of files is growing, leading to significant costs primarily due to frequent file retrievals (over 4 million calls per month). File sizes vary from a few kilobytes to several gigabytes. I consider this a tricky topic since they store PDFs, images, videos, etc.

Currently, they use a straightforward application that reads files from S3 and delivers them to end-users. To optimize costs, I'm exploring solutions like Amazon CloudFront for caching and reducing data transfer costs. However, I lack direct experience with CloudFront. My client's veteran architects believe that using CloudFront will increase costs. Their reasoning is that AWS wouldn't offer any option that could reduce our costs, since its business model relies on charging per read. I don't agree with them.

What strategies or best practices would you recommend to optimize costs in this scenario? Are there specific configurations or services within AWS that could help reduce costs associated with frequent file retrievals and large data transfers from S3? The main goal is to reduce costs as much as possible.

Any insights or recommendations would be greatly appreciated! Thank you.

asked 3 months ago, 306 views
2 Answers
Hello.

As you know, using CloudFront for caching reduces the number of requests to S3.
In cases like this, where there is a lot of access and a lot of data transfer, I think it is often cheaper to distribute from CloudFront.
As shown in the pricing table below, CloudFront is set up so that the more data you transfer, the cheaper the charges will be.
https://aws.amazon.com/cloudfront/pricing/?nc1=h_ls
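
To illustrate how volume-tiered pricing lowers the average price per GB as traffic grows, here is a minimal Python sketch. The tier sizes and rates are placeholders I made up for illustration, not quoted AWS prices, so please substitute the figures from the pricing page for your region.

```python
# Illustrative volume tiers: (tier size in GB, price per GB in USD).
# ASSUMPTION: these rates are placeholders, not quoted AWS prices.
ILLUSTRATIVE_TIERS = [
    (10_000, 0.085),        # first 10,000 GB
    (40_000, 0.080),        # next 40,000 GB
    (float("inf"), 0.060),  # everything beyond that
]


def tiered_transfer_cost(total_gb, tiers=ILLUSTRATIVE_TIERS):
    """Charge each slice of traffic at the rate of the tier it falls into."""
    remaining = total_gb
    cost = 0.0
    for size_gb, price_per_gb in tiers:
        if remaining <= 0:
            break
        billed = min(remaining, size_gb)
        cost += billed * price_per_gb
        remaining -= billed
    return cost


def average_price_per_gb(total_gb, tiers=ILLUSTRATIVE_TIERS):
    """Blended price per GB for a given monthly volume (total_gb must be > 0)."""
    return tiered_transfer_cost(total_gb, tiers) / total_gb
```

With these placeholder tiers, `average_price_per_gb(20_000)` comes out lower than `average_price_per_gb(10_000)`, which is the effect described above.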

EXPERT
answered 3 months ago
EXPERT
reviewed 3 months ago

Your cost structure sounds rather straightforward. The request fees for just 4 million requests would be on the order of a few dollars/euros per month, so the bulk of your costs should be coming from data transfer out to the internet. Data transfer from S3 origins to CloudFront edge locations is free, with the traffic charges applying instead to the traffic from the CloudFront edge locations to your users.
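
As a rough check of the request-fee estimate, here is a small Python sketch. The GET price of 0.0004 USD per 1,000 requests is my assumption based on S3 Standard list pricing in a typical US region; please verify it against the S3 pricing page for your region and storage class.

```python
# ASSUMPTION: S3 GET price in USD per 1,000 requests (verify for your region).
ASSUMED_GET_PRICE_PER_1000 = 0.0004


def s3_get_request_cost(requests, price_per_1000=ASSUMED_GET_PRICE_PER_1000):
    """Monthly S3 GET request fees, excluding data transfer."""
    return requests / 1000 * price_per_1000


monthly_requests = 4_000_000  # figure from the question
print(f"Estimated GET request fees: ${s3_get_request_cost(monthly_requests):.2f}")
```

Under that assumed price, the request fees stay in the low single digits of dollars per month, which is why data transfer is the more likely cost driver.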

The list prices for traffic outbound to the internet aren't significantly different for CloudFront (in the price class 100, assuming it's suitable for you) compared to S3, but as you can see at the bottom of the CloudFront pricing page https://aws.amazon.com/cloudfront/pricing/, the first 1 TB of traffic and the first 10,000,000 requests are free, meaning that the price list applies to usage exceeding those amounts. That might be notable if your use is relatively small.

For example, your 4 million requests would all be covered by the free allowance for per-request fees. Only the requests that CloudFront makes to S3 when a requested file isn't in its cache would be charged. You didn't mention the amount of data traffic, so I can't say what fraction of it CloudFront's free 1 TB would cover, but any traffic exceeding 1 TB would be charged according to the CloudFront pricing sheet.
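
To put numbers on this once you know your traffic, a sketch like the following could help. The free allowances (1 TB and 10,000,000 requests) come from the pricing page mentioned above; the function name, the cache hit ratio, treating 1 TB as 1,000 GB, and all prices passed in are assumptions for illustration only.

```python
FREE_TRANSFER_GB = 1_000      # 1 TB free allowance, treated as 1,000 GB (assumption)
FREE_REQUESTS = 10_000_000    # free request allowance per month


def cloudfront_estimate(transfer_gb, viewer_requests, cache_hit_ratio,
                        price_per_gb, price_per_10k_requests,
                        s3_get_price_per_1000):
    """Rough monthly estimate for serving S3 content through CloudFront.

    All prices are caller-supplied assumptions; a single flat per-GB rate
    is used instead of volume tiers to keep the sketch short.
    """
    billable_gb = max(0, transfer_gb - FREE_TRANSFER_GB)
    billable_requests = max(0, viewer_requests - FREE_REQUESTS)
    # Only cache misses reach S3 and incur S3 GET request fees.
    origin_requests = viewer_requests * (1 - cache_hit_ratio)
    costs = {
        "transfer": billable_gb * price_per_gb,
        "cloudfront_requests": billable_requests / 10_000 * price_per_10k_requests,
        "s3_origin_requests": origin_requests / 1000 * s3_get_price_per_1000,
    }
    costs["total"] = sum(costs.values())
    return costs
```

For instance, `cloudfront_estimate(5_000, 4_000_000, 0.9, 0.085, 0.01, 0.0004)` bills 4,000 GB of transfer, no CloudFront request fees, and S3 GET fees for roughly 400,000 cache misses. The hit ratio is a guess here; the real value depends on how often the same files are requested and on your cache settings.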

EXPERT
Leo K
answered 3 months ago
AWS
EXPERT
reviewed 3 months ago
