Cost effective methods for accessing S3 buckets cross-region

19 minute read
Content level: Advanced

This article helps users who need to access S3 data across Regions make the most cost-effective choice for their use case.

When you need to access Amazon S3 buckets across AWS Regions, from Amazon EC2 instances or other AWS services, you might need to optimize for cost while aligning with your organizational priorities. This article covers six common patterns you should consider. Each pattern has different trade-offs related to organizational policies, network architecture, and cost.

For simplicity, we'll list these options in order of increasing data transfer costs. We use 1,000 GB/month as the total traffic estimate for a month. All prices are correct at time of writing.

Option 1: Utilize IPv6 and Egress only Internet Gateway (EIGW)

The lowest cost pattern is to deploy an EIGW, and use a dual stack endpoint to connect to S3.
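As a rough illustration of what "dual stack endpoint" means in practice, the sketch below builds the dual-stack hostname for a Region. It assumes the standard `s3.dualstack.<region>.amazonaws.com` naming used in the commercial AWS partition; the function name and the bucket name are hypothetical.

```python
from typing import Optional


def s3_dualstack_endpoint(region: str, bucket: Optional[str] = None) -> str:
    """Return the HTTPS URL of the S3 dual-stack (IPv4 + IPv6) endpoint.

    Assumes the standard commercial-partition naming scheme.
    """
    host = f"s3.dualstack.{region}.amazonaws.com"
    if bucket:
        # Virtual-hosted style: the bucket name becomes the first DNS label.
        host = f"{bucket}.{host}"
    return f"https://{host}"


# Hypothetical bucket name, for illustration only.
print(s3_dualstack_endpoint("us-west-2"))
print(s3_dualstack_endpoint("us-west-2", "amzn-s3-demo-bucket"))
```

Most SDKs and the AWS CLI can select this endpoint through a configuration setting rather than a hand-built URL, so treat the function as an explanation of the naming rather than a recommended integration point.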

Costs

With this pattern, you pay only for cross-region data transfer. The type of data transfer will depend on whether bytes are sent to or received from S3, though pricing is typically the same for both.

  • Bytes sent to S3 from EC2, for example during uploads:
    • No charge from S3.
    • EC2 data transfer charge, from the Data Transfer OUT From Amazon EC2 To {region} table.
  • Bytes received from S3, for example during downloads:
    • S3 data transfer charge, from the Data Transfer OUT From Amazon S3 To {region} table.
    • No charge from EC2.

Example costs: For an EC2 instance in us-east-1, at the lowest volume tier outside of the 100 GB free tier, to reach an S3 bucket in us-west-2, this option would generate charges at the following rates:

  • Uploads/Downloads: $0.02 per GB transferred.

This gives a monthly total of $20.00 (1,000 GB * $0.02).
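The billing rule above can be sketched as a small function: the direction of the transfer decides which service the charge appears under, while the rate is the same in this example. The function name is illustrative, and the $0.02 default is the us-east-1 to us-west-2 rate quoted in this article, not a value read from any AWS API.

```python
def transfer_charge(direction: str, gb: float, rate_per_gb: float = 0.02):
    """Return (billing service, charge in USD) for a cross-Region S3 transfer.

    direction: "upload" (EC2 -> S3) or "download" (S3 -> EC2).
    """
    # Uploads are billed as EC2 data transfer out; downloads as S3 data transfer out.
    billed_by = {"upload": "EC2", "download": "S3"}[direction]
    return billed_by, round(gb * rate_per_gb, 2)


for direction in ("upload", "download"):
    service, usd = transfer_charge(direction, 1000)
    print(f"{direction}: billed by {service}, ${usd:.2f}")
```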

Other considerations

Your organization may have policies against this. Two concerns usually motivate these policies:

  1. You want to maintain a data perimeter, and an EIGW allows access to the internet. Adding a firewall layer could mitigate this, but firewalls add extra cost and complexity which are outside the scope of this guide. While a firewall might easily be configured to limit access to sites other than S3, restricting it to a specific bucket is more challenging and requires an advanced firewall. Helping customers solve that challenge is one of the motivations for PrivateLink, which will be discussed in other options.
  2. Traffic that passes through an Internet Gateway or EIGW may be classified differently than traffic that does not, even though it does not "go over the internet" to reach Amazon S3 endpoints (see "Does traffic go over the internet when two instances communicate using public IP addresses, or when instances communicate with a public AWS service endpoint?").

Additionally, you may not be ready to support IPv6 in your VPCs.

Option 2: Utilize Internet Gateway (IGW), and an EC2 instance with an IPv4 public IP

The next lowest-cost option is similar to Option 1, except that your EC2 instances require public IPv4 addresses and access the Amazon S3 IPv4 endpoints.

Costs

Costs are similar to Option 1, with one addition: you are charged for each public IPv4 address you use.

Example costs: For an EC2 instance in us-east-1, at the lowest volume tier outside of the 100 GB free tier, to reach an S3 bucket in us-west-2, this option would generate charges at the following rates:

  • Uploads/Downloads: $0.02 per GB transferred.
  • Per Hour: $0.005 per hour per public IP.

This gives a monthly total of $23.65 ($20.00 (1000 GB * $0.02) + $3.65 ($0.005 * 730 hours)).

Other considerations

In addition to the concerns listed for Option 1, this option adds the need to block inbound access. While a simple security group can do this (and does by default), the same is needed for every resource with a public IP in any subnet that routes to the IGW. Validating and verifying this at a per-instance, per-security-group level is more complex than never provisioning an IGW at all. Because security is critical, defense-in-depth techniques that create easily verifiable barriers are valuable.

As such, most organizations choose a different option.

Option 3: VPC interface endpoint (VPCE) with VPC peering

In this option, you create PrivateLink interface endpoints for S3 in each region where you have a bucket that needs to be accessed, and then create cross-region VPC peering connections to the VPCs that provide those interface endpoints.

Costs

Your costs involve:

  • PrivateLink interface endpoint: $0.01 per AZ per hour.
  • PrivateLink data processing charges: tiered pricing between $0.01-$0.004 per GB. These charges apply to bytes sent to and received from S3.
  • Bytes sent to S3 from EC2: Same as Option 1.
  • Bytes received from S3: Same as Option 1.
    • However, instead of an S3 data transfer charge, you will see an EC2 data transfer charge. This differs from Option 1 because the traffic is charged as it enters the VPC where the interface endpoint exists. S3 then bills that traffic as if it came from the region where the interface endpoint resides. Because those rates are the same, the amount should be the same, but it changes how the charge appears on bills and in AWS Cost Explorer.

Example costs: For an EC2 instance in us-east-1 (at the lowest volume tier) to reach an S3 bucket in us-west-2, this option would generate charges at the following rates:

  • Uploads/Downloads: $0.03 per GB transferred (from $0.01 (VPCE) + $0.02 (inter-Region data transfer))
  • Per Hour: $0.01 per hour per VPCE.

This gives a monthly total of $37.30 ($30.00 (1000 GB * $0.03) + $7.30 ($0.01 * 730 hours)).
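The arithmetic behind this total can be sketched as follows. The rates are the ones quoted above for the lowest volume tier; the function name and the `endpoint_azs` parameter are illustrative, added to show how the per-AZ hourly charge scales.

```python
HOURS_PER_MONTH = 730  # hours per month used throughout this article

# Rates from the us-east-1 to us-west-2 example (lowest volume tier).
VPCE_DATA_PROCESSING_PER_GB = 0.01
INTER_REGION_TRANSFER_PER_GB = 0.02
VPCE_PER_AZ_HOUR = 0.01


def option_3_monthly_cost(traffic_gb: float, endpoint_azs: int = 1):
    """Return (traffic cost, static cost, total) in USD for Option 3."""
    per_gb = VPCE_DATA_PROCESSING_PER_GB + INTER_REGION_TRANSFER_PER_GB
    traffic = traffic_gb * per_gb
    static = endpoint_azs * VPCE_PER_AZ_HOUR * HOURS_PER_MONTH
    return round(traffic, 2), round(static, 2), round(traffic + static, 2)


traffic, static, total = option_3_monthly_cost(1000)
print(f"traffic ${traffic:.2f} + static ${static:.2f} = ${total:.2f}")
```

Passing `endpoint_azs=3` shows the effect of placing the endpoint in three Availability Zones: the traffic cost is unchanged while the static cost triples.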

Other considerations

Creating a large number of VPC peering connections can be a management challenge. Additionally, quotas exist for the number of peering connections per VPC, and complex environments may run into those quotas. These motivations often encourage moving toward Transit Gateway (TGW) solutions, despite their extra cost (see Option 5). VPC peering connections might also be something your organization avoids or discourages in a security policy. In many general scenarios, compared to TGW solutions, VPC peering security is harder to manage at scale.

While the relevance of these first concerns would be lower for a central VPC that contained only VPC interface endpoints, the general policies might still be a blocker. Changing those policies to be more flexible, or pursuing exceptions, may not be a high priority if TGWs are already in use for other purposes.

Those considerations accounted for, this option provides these advantages:

  • The PrivateLink interface endpoint only enables access to S3, not other sites or services. This helps maintain a data perimeter approach.
  • You can define a VPC endpoint policy, limiting the S3 buckets accessible via the endpoint, furthering consistency with data perimeter approaches.
  • Data clearly stays within managed networks, which satisfies common customer policies.
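To make the endpoint policy point concrete, here is a minimal sketch that generates a VPC endpoint policy document restricted to a list of buckets. The bucket name, the `Sid`, and the chosen actions are assumptions for illustration; a real policy should list only the actions your workloads need.

```python
import json


def s3_endpoint_policy(bucket_names):
    """Build an endpoint policy that allows access only to the named buckets."""
    resources = []
    for name in bucket_names:
        resources.append(f"arn:aws:s3:::{name}")    # bucket-level actions (ListBucket)
        resources.append(f"arn:aws:s3:::{name}/*")  # object-level actions (Get/PutObject)
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowListedBucketsOnly",  # hypothetical statement ID
                "Effect": "Allow",
                "Principal": "*",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"],
                "Resource": resources,
            }
        ],
    }


# Hypothetical bucket name, for illustration only.
print(json.dumps(s3_endpoint_policy(["amzn-s3-demo-bucket"]), indent=2))
```

Because an endpoint policy only limits what can pass through the endpoint, requests still need to be allowed by the caller's IAM policy and the bucket policy.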

After choosing this option, another decision is needed: should endpoints be centralized or distributed? Because there is a per-AZ per-hour charge for endpoints, having an endpoint for every source VPC is inefficient. Instead, reusing a centralized VPC endpoint with multiple source VPCs attached will help minimize that cost. As you are already defining connectivity between VPCs via VPC peering, this is a matter of connecting multiple source VPCs from one region to a single VPC in the destination region. If the number of VPCs needing a destination is greater than the VPC peering connection quota, use multiple destination VPCs in a 50:1 ratio.
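The 50:1 ratio can be turned into a quick sizing check. This sketch assumes each destination VPC can accept 50 peering connections, as the ratio above implies; the function name is illustrative, and you should substitute your account's actual quota.

```python
import math


def destination_vpcs_needed(source_vpcs: int, peering_quota: int = 50) -> int:
    """Number of central endpoint VPCs needed in a destination Region."""
    if source_vpcs <= 0:
        return 0
    # Each destination VPC can peer with at most `peering_quota` source VPCs.
    return math.ceil(source_vpcs / peering_quota)


for count in (30, 50, 51, 120):
    print(count, "source VPCs ->", destination_vpcs_needed(count), "destination VPC(s)")
```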

Option 4: S3 Multi-Region Access Points with Global interface endpoint

The next option utilizes a feature of Amazon S3 called Multi-Region Access Points. These route all S3 data request traffic through a single global endpoint, and allow you to directly control the shift of S3 data request traffic between AWS Regions at any time. In this use case, you could create a Multi-Region Access Point with your single bucket added to it. All requests to Multi-Region Access Points must be addressed to their Amazon Resource Name (ARN) - they do not have a bucket-style alias.
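Since requests must be addressed by ARN, it can help to see how such an ARN is put together. The sketch below assumes the documented Multi-Region Access Point ARN layout (no Region field, account ID, then `accesspoint/<alias>`); the account ID and alias shown are placeholders.

```python
def mrap_arn(account_id: str, mrap_alias: str, partition: str = "aws") -> str:
    """Build the ARN of a Multi-Region Access Point.

    The Region field is left empty because a Multi-Region Access Point
    is not tied to a single Region.
    """
    return f"arn:{partition}:s3::{account_id}:accesspoint/{mrap_alias}"


# Placeholder account ID and alias, for illustration only.
arn = mrap_arn("123456789012", "mfzwi23gnjvgw.mrap")
print(arn)
```

In SDK calls, this ARN is typically supplied wherever a bucket name would otherwise go; check your SDK's documentation for its signing requirements before relying on that.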

Additionally, you need to create a PrivateLink endpoint for the Multi-Region Access Point (com.amazonaws.s3-global.accesspoint). These PrivateLink interface endpoints allow access to Multi-Region Access Points globally from the region where they are created. However, it's important to note that these endpoints do not grant access to all buckets globally. They only allow access to a bucket through its participation in a Multi-Region Access Point. It's also worth noting that the endpoints themselves are not global. Like all VPC interface endpoints, they are regional, with zonal IPs, and specific to the VPC to which they are attached.

Costs

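Your costs involve the same PrivateLink interface endpoint and inter-Region data transfer charges as Option 3, plus a per-GB Multi-Region Access Point data routing charge.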

Example costs: For an EC2 instance in us-east-1 (at the lowest volume tier) to reach an S3 bucket in us-west-2, this option would generate charges at the following rates:

  • Uploads/Downloads: $0.0333 per GB transferred. ($0.01 (VPCE) + $0.02 (Inter-region data transfer) + $0.0033 (Multi-Region Access Point data routing)).
  • Per Hour: $0.01 per hour per VPCE.

This gives a monthly total of $40.60 ($33.30 (1000 GB * $0.0333) + $7.30 ($0.01 * 730 hours)).

Other considerations

The primary use cases for Multi-Region Access Points are performance and multi-region resilience, rather than cost. As such, this option has fewer published examples and may be less well-known. Your organization may not have used Multi-Region Access Points in the past, and thus have no established policies or guardrails.

Those considerations accounted for, this option provides these advantages:

  • The Multi-Region Access Point PrivateLink interface endpoint only enables access to S3, not other sites or services. This helps maintain a data perimeter approach.
  • You can define a Multi-Region Access Point policy, limiting the Multi-Region Access Points and thus buckets accessible via the endpoint, furthering consistency with data perimeter approaches.
  • Data clearly stays within managed networks, which satisfies common customer policies.
  • With an S3 Multi-Region Access Point in front of the S3 bucket and s3-global VPC endpoints local to the compute, there is no need for VPC peering.

In addition, after choosing this option, another decision is needed: should endpoints be centralized or distributed? Because there is a per AZ per hour charge for endpoints, having an endpoint for every VPC is inefficient. Instead, reusing a centralized VPC endpoint with multiple source VPCs attached will help minimize that cost.

Option 5: VPC endpoints with Transit Gateway

This option is similar to Option 3, but instead of connecting the VPCs with VPC peering, a Transit Gateway is used.

Costs

Your costs involve:

  • PrivateLink interface endpoint: $0.01 per AZ per hour.
  • PrivateLink data processing charges: tiered pricing between $0.01-$0.004 per GB. These charges apply to bytes sent to and received from S3.
  • Transit Gateway attachment: $0.05 per attachment per hour (required in both VPCs).
  • Transit Gateway per GB data processed: $0.02 per GB
  • Bytes sent to S3 from EC2: This is the same as for Option 1.
  • Bytes received from S3: Instead of an S3 data transfer charge, you will see an EC2 data transfer charge. This differs from Option 1 because the traffic is charged as it enters the VPC where the interface endpoint exists. S3 then bills that traffic as if it came from the region where the interface endpoint resides. Because those rates are the same, the amount should be the same, but it changes how the charge appears on bills and in AWS Cost Explorer.

Example costs: For an EC2 instance in us-east-1 (at the lowest volume tier) to reach an S3 bucket in us-west-2, this option would generate charges at the following rates:

  • Uploads/Downloads: $0.05 per GB transferred. ($0.01 (VPCE) + $0.02 (inter-Region data transfer) + $0.02 (TGW data processing))
  • Per Hour: $0.11 per hour. Based upon [$0.01 per hour (VPCE) + $0.10 per hour (TGW attachment x 2)]

This gives a monthly total of $130.30 ($50.00 (1000 GB * $0.05) + $80.30 ($0.11 * 730 hours))

However, if we assume that TGW attachments pre-existed this solution, the monthly total is $57.30 ($50.00 (1000 GB * $0.05) + $7.30 ($0.01 * 730 hours)).
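The difference between the two totals comes entirely from the Transit Gateway attachment hours. A small sketch, using the rates listed above and an illustrative `new_tgw_attachments` parameter, makes that explicit:

```python
HOURS_PER_MONTH = 730  # hours per month used throughout this article

# Rates from the us-east-1 to us-west-2 example (lowest volume tier).
VPCE_PER_GB = 0.01
INTER_REGION_PER_GB = 0.02
TGW_PER_GB = 0.02
VPCE_PER_HOUR = 0.01
TGW_ATTACHMENT_PER_HOUR = 0.05


def option_5_monthly_cost(traffic_gb: float, new_tgw_attachments: int = 2) -> float:
    """Monthly USD cost for Option 5.

    new_tgw_attachments: attachments that must be created for this solution
    (0 if both VPCs are already attached to a Transit Gateway).
    """
    per_gb = VPCE_PER_GB + INTER_REGION_PER_GB + TGW_PER_GB
    per_hour = VPCE_PER_HOUR + TGW_ATTACHMENT_PER_HOUR * new_tgw_attachments
    return round(traffic_gb * per_gb + per_hour * HOURS_PER_MONTH, 2)


print(f"new attachments:      ${option_5_monthly_cost(1000, 2):.2f}")
print(f"existing attachments: ${option_5_monthly_cost(1000, 0):.2f}")
```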

Other considerations

This option is the most feature heavy, but also has the highest costs. It provides these advantages:

  • The PrivateLink interface endpoint only enables access to S3, not other sites or services. This helps maintain a data perimeter approach.
  • You can define a VPC endpoint policy, limiting the S3 buckets accessible via the endpoint, furthering consistency with data perimeter approaches.
  • Data clearly stays within managed networks, which satisfies common customer policies.
  • Management of connections and routing between VPCs can utilize Transit Gateway routing capabilities.

Option 6: NAT Gateway

This option uses a NAT Gateway provisioned into the VPC with the EC2 instance, allowing access to the Amazon S3 IPv4 endpoints.

Costs

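Your costs involve NAT Gateway hourly and data processing charges, a public IPv4 address charge, and inter-Region data transfer.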

Example costs: For an EC2 instance in us-east-1 (at the lowest volume tier) to reach an S3 bucket in us-west-2, this option would generate charges at the following rates:

  • Uploads/Downloads: $0.065 per GB transferred. ($0.045 (NAT data processing) + $0.02 (inter-Region data transfer))
  • Per Hour: $0.05 per hour. ($0.045 per hour (NAT-GW) + $0.005 per hour (public IP))

This gives a monthly total of $101.50 ($65.00 (1000 GB * $0.065) + $36.50 ($0.05 * 730 hours)).

However, if we assume that the NAT Gateway pre-existed this solution, the monthly total is $65.00.

Other considerations

See the considerations for Option 1.

Similar to VPC interface endpoints, a NAT Gateway can be used as a centralized solution, shared by multiple VPCs via VPC peering or a Transit Gateway. While this approach provides a simple solution, it may not scale well for high-volume usage.

Performance

For most use cases, all six options will have similar enough performance characteristics that this should not be a key decision point. However, you will want to monitor capacity for VPC endpoints and NAT Gateways.

NAT Gateways begin scaling after 5 Gbps, and VPC endpoints after 10 Gbps; both scale automatically up to 100 Gbps. If you exceed 100 Gbps, you'll want to scale horizontally by decentralizing and routing.
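As a rough planning aid, the thresholds in this paragraph can be encoded in a helper that classifies a peak throughput figure. The thresholds are the ones stated above; the function and category names are illustrative.

```python
# Throughput thresholds in Gbps, as stated in this article.
BASELINE_GBPS = {"nat_gateway": 5, "vpc_endpoint": 10}
MAX_GBPS = 100


def capacity_status(kind: str, peak_gbps: float) -> str:
    """Classify a peak throughput against the scaling thresholds."""
    baseline = BASELINE_GBPS[kind]
    if peak_gbps > MAX_GBPS:
        return "exceeds maximum: scale horizontally"
    if peak_gbps > baseline:
        return "within automatic scaling range"
    return "within baseline"


print(capacity_status("nat_gateway", 8))
print(capacity_status("vpc_endpoint", 8))
print(capacity_status("vpc_endpoint", 150))
```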

There are other limits specific to very large numbers of concurrent connections, but since S3 traffic is unlikely to be a primary contributor to those limits, this detail is not covered in depth here.

NAT Gateways would add the most latency of these solutions, but in most cases this difference is not significant enough to influence decisions.

Additional Examples

The options above provide an estimate for a single example. Because some costs are based on static provisioning and some are based on traffic volume, I wanted to provide some additional examples showing how each influences the total. In addition, the examples so far focused on a single instance, in a single VPC, in a single AZ. While this might fit some use cases like a low-priority data migration, other cases involve greater resilience and a larger number of participating resources. While I can't provide an example for every case, I did want to show a few.

Remember, when examining these examples, that if a static resource already exists, for example Transit Gateway VPC attachments, you won't need to provision another, and so that cost should be subtracted from these estimates that assume no existing resources are in place. If your organization uses Transit Gateway, many, maybe all, of your VPCs will already have attachments.

Example 1: Low volume, low priority, single source

This example is the one used throughout the earlier discussion. For most scenarios a higher degree of resilience is recommended. This example is offered to illustrate the most minimal requirements. It assumes a single VPC (and single instance for the Internet Gateway scenario), communicating with only one other region, and does not provide resilience to an availability zone impairment. Total traffic for the month is 1 TB with traffic passing from us-east-1 to us-west-2.

| | (1) Egress Only Internet Gateway | (2) Internet Gateway | (3) VPC Peering | (4) Multi-Region Access Point | (5) Transit Gateway | (6) NAT-Gateway |
|---|---|---|---|---|---|---|
| Traffic Costs | $20.00 | $20.00 | $30.00 | $33.30 | $50.00 | $65.00 |
| Static Costs | - | $3.65 | $7.30 | $7.30 | $80.30 | $36.50 |
| Total Costs | $20.00 | $23.65 | $37.30 | $40.60 | $130.30 | $101.50 |

Example 2: Low volume, low priority, development

In this second example, we show the impacts from needing to provision additional static resources due to having multiple regions, VPCs and instances. Like example 1, this example does not offer resiliency to availability zone impairments and is only appropriate to low priority cases, for example development in a cost conscious environment.

In this example, there are 4 regions, each with 10 VPCs, and each VPC has 3 EC2 instances (thus 120 total instances). Total traffic for the month is 1 TB. We'll assume traffic passes between regions at rates similar to those from us-east-1 to us-west-2. A realistic example would be the set of us-east-1, us-west-2, eu-west-1, and eu-central-1.

| | (1) Egress Only Internet Gateway | (2) Internet Gateway | (3) VPC Peering | (4) Multi-Region Access Point | (5) Transit Gateway | (6) NAT-Gateway |
|---|---|---|---|---|---|---|
| Traffic Costs | $20.00 | $20.00 | $30.00 | $33.30 | $50.00 | $65.00 |
| Static Costs | - | $438.00 | $292.00 | $292.00 | $1,752.00 | $1,460.00 |
| Total Costs | $20.00 | $458.00 | $322.00 | $325.30 | $1,802.00 | $1,525.00 |
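The static costs in this table can be reproduced with a short model. The resource counts below are an assumption inferred from the figures rather than stated in the article: one public IP per instance, one interface endpoint or NAT Gateway per VPC per AZ, and one Transit Gateway attachment per VPC. Hourly rates are the ones quoted in the option sections.

```python
HOURS_PER_MONTH = 730  # hours per month used throughout this article

# Hourly rates quoted in the option sections.
PUBLIC_IP_HOUR = 0.005
VPCE_AZ_HOUR = 0.01
TGW_ATTACHMENT_HOUR = 0.05
NAT_GW_HOUR = 0.045


def distributed_static_costs(regions, vpcs_per_region, instances_per_vpc, azs_per_vpc=1):
    """Monthly static cost per option when every VPC has its own resources."""
    vpcs = regions * vpcs_per_region
    instances = vpcs * instances_per_vpc
    per_az_units = vpcs * azs_per_vpc  # endpoints or NAT Gateways
    hourly = {
        "internet_gateway": instances * PUBLIC_IP_HOUR,
        "vpc_peering": per_az_units * VPCE_AZ_HOUR,
        "multi_region_access_point": per_az_units * VPCE_AZ_HOUR,
        # Endpoint per AZ plus one Transit Gateway attachment per VPC.
        "transit_gateway": per_az_units * VPCE_AZ_HOUR + vpcs * TGW_ATTACHMENT_HOUR,
        # Each NAT Gateway also carries a public IPv4 address.
        "nat_gateway": per_az_units * (NAT_GW_HOUR + PUBLIC_IP_HOUR),
    }
    return {name: round(rate * HOURS_PER_MONTH, 2) for name, rate in hourly.items()}


for name, usd in distributed_static_costs(4, 10, 3).items():
    print(f"{name}: ${usd:,.2f}")
```

Under these assumptions the Transit Gateway static cost works out to $1,752.00, which matches the $1,802.00 total once the $50.00 traffic cost is added.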

With centralization

If we use a central service VPC approach, we can reduce the number of static resources.

  • Egress Only Internet Gateway and Internet Gateway options have no such resources, so are not affected.
  • For VPC Peering and Transit Gateway, this means that instead of each VPC connecting to a corresponding VPC in other regions, one VPC in each region receives traffic from all the VPCs in other regions.
  • For Multi-Region Access Points, this means that instead of each VPC having its own Multi-Region Access Point global VPC endpoints, one VPC per region contains such an endpoint, or one endpoint per AZ, and other VPCs connect to that VPC via VPC peering.
  • For NAT Gateways, this means a single NAT Gateway, or one per AZ, exists in each region, rather than each VPC having its own set of NAT Gateways.

| | (3) VPC Peering | (4) Multi-Region Access Point | (5) Transit Gateway | (6) NAT-Gateway |
|---|---|---|---|---|
| Traffic Costs | $30.00 | $33.30 | $50.00 | $65.00 |
| Static Costs | $29.20 | $29.20 | $1,635.20 | $146.00 |
| Total Costs | $59.20 | $62.50 | $1,685.20 | $211.00 |
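The centralized figures can be reproduced the same way. The counts here are again an inference from the table, not something the article states: one endpoint or NAT Gateway per AZ in a single central VPC per Region, and, for Transit Gateway, one attachment for every workload VPC plus one for each central VPC.

```python
HOURS_PER_MONTH = 730  # hours per month used throughout this article

# Hourly rates quoted in the option sections.
PUBLIC_IP_HOUR = 0.005
VPCE_AZ_HOUR = 0.01
TGW_ATTACHMENT_HOUR = 0.05
NAT_GW_HOUR = 0.045


def centralized_static_costs(regions, vpcs_per_region, azs_per_central_vpc=1):
    """Monthly static cost per option with one central service VPC per Region."""
    central_units = regions * azs_per_central_vpc  # endpoints or NAT Gateways
    # Assumption: workload VPCs and central VPCs each need a TGW attachment.
    attachments = regions * vpcs_per_region + regions
    hourly = {
        "vpc_peering": central_units * VPCE_AZ_HOUR,
        "multi_region_access_point": central_units * VPCE_AZ_HOUR,
        "transit_gateway": central_units * VPCE_AZ_HOUR + attachments * TGW_ATTACHMENT_HOUR,
        "nat_gateway": central_units * (NAT_GW_HOUR + PUBLIC_IP_HOUR),
    }
    return {name: round(rate * HOURS_PER_MONTH, 2) for name, rate in hourly.items()}


for name, usd in centralized_static_costs(4, 10).items():
    print(f"{name}: ${usd:,.2f}")
```

The Transit Gateway option benefits least from centralization in this model because the attachment charge applies per VPC regardless of where the endpoints live.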

Example 3: Low volume, redundant

In this example, we show how redundancy brings additional costs, but with some solutions accelerating faster than others.

In this example, there are 4 regions, each with 10 VPCs, and each VPC has 3 EC2 instances (thus 120 total instances). Total traffic for the month is 1 TB. We'll assume traffic passes between regions at rates similar to those from us-east-1 to us-west-2. A realistic example would be the set of us-east-1, us-west-2, eu-west-1, and eu-central-1. There are also 3 AZs per VPC, each of which has redundant networking resources.

| | (1) Egress Only Internet Gateway | (2) Internet Gateway | (3) VPC Peering | (4) Multi-Region Access Point | (5) Transit Gateway | (6) NAT-Gateway |
|---|---|---|---|---|---|---|
| Traffic Costs | $20.00 | $20.00 | $30.00 | $33.30 | $50.00 | $65.00 |
| Static Costs | - | $438.00 | $876.00 | $876.00 | $2,336.00 | $4,380.00 |
| Total Costs | $20.00 | $458.00 | $906.00 | $909.30 | $2,386.00 | $4,445.00 |

With centralization

| | (3) VPC Peering | (4) Multi-Region Access Point | (5) Transit Gateway | (6) NAT-Gateway |
|---|---|---|---|---|
| Traffic Costs | $30.00 | $33.30 | $50.00 | $65.00 |
| Static Costs | $87.60 | $87.60 | $1,693.60 | $438.00 |
| Total Costs | $117.60 | $120.90 | $1,743.60 | $503.00 |

Example 4: High volume, redundant

In this final example, we show how higher traffic volume causes traffic costs to be of higher importance. We use all the same resources as Example 3, but assume 100,000 GB of traffic per month. This amount would be equivalent to 38 MB/s of continuous traffic over the course of a month. Since all resources provisioned support 10 Gbps or higher, there's no need to provision extra networking resources to meet that traffic volume.
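The 38 MB/s figure follows from spreading the monthly volume evenly over a 730-hour month. A quick check, using decimal units (1 GB = 1,000 MB) and an illustrative function name:

```python
HOURS_PER_MONTH = 730  # hours per month used throughout this article


def average_throughput(gb_per_month: float):
    """Return (MB/s, Gbps) for traffic spread evenly across the month."""
    seconds = HOURS_PER_MONTH * 3600
    mb_per_s = gb_per_month * 1000 / seconds  # decimal megabytes per second
    gbps = gb_per_month * 8 / seconds         # gigabits per second
    return mb_per_s, gbps


mb_per_s, gbps = average_throughput(100_000)
print(f"{mb_per_s:.1f} MB/s, {gbps:.2f} Gbps")
```

The average is well under 1 Gbps, so even sizeable bursts above it leave a wide margin before the 10 Gbps figure mentioned above.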

| | (1) Egress Only Internet Gateway | (2) Internet Gateway | (3) VPC Peering | (4) Multi-Region Access Point | (5) Transit Gateway | (6) NAT-Gateway |
|---|---|---|---|---|---|---|
| Traffic Costs | $2,000.00 | $2,000.00 | $3,000.00 | $3,330.00 | $5,000.00 | $6,500.00 |
| Static Costs | - | $438.00 | $876.00 | $876.00 | $2,336.00 | $4,380.00 |
| Total Costs | $2,000.00 | $2,438.00 | $3,876.00 | $4,206.00 | $7,336.00 | $10,880.00 |

With centralization

| | (3) VPC Peering | (4) Multi-Region Access Point | (5) Transit Gateway | (6) NAT-Gateway |
|---|---|---|---|---|
| Traffic Costs | $3,000.00 | $3,330.00 | $5,000.00 | $6,500.00 |
| Static Costs | $87.60 | $87.60 | $1,693.60 | $438.00 |
| Total Costs | $3,087.60 | $3,417.60 | $6,693.60 | $6,938.00 |

AWS EXPERT, published 2 months ago