I want to troubleshoot slow performance when my gateway on AWS Storage Gateway uploads to AWS.
Resolution
Review your internet bandwidth or network throughput to AWS
The internet speed between your gateway and AWS can affect upload performance. To determine the available internet bandwidth to your gateway, run a network test from a virtual machine (VM). Or, use a system that's on the same network as your gateway device.
For example, suppose that your gateway connects to AWS through an Amazon Virtual Private Cloud (Amazon VPC) endpoint for Amazon Simple Storage Service (Amazon S3) over an AWS Direct Connect or VPN connection. In this case, run a network throughput test from an on-premises VM to an instance in the VPC.
The test is different if you host your gateway on-premises and connect to AWS through a VPC endpoint for Storage Gateway over AWS Direct Connect or a VPN connection. In this configuration, the traffic from the gateway to the S3 bucket crosses the public virtual interface or public internet, so that is the path whose throughput matters. If the public virtual interface or internet connection is congested, then your gateway's upload performance might be affected. To allow traffic to cross the private virtual interface, set up your gateway with an Amazon S3 PrivateLink VPC endpoint. When you use this configuration, you must create and configure an Amazon Elastic Compute Cloud (Amazon EC2) proxy on your gateway device.
Check the size of the files that are written to the Storage Gateway device
When you upload larger files, Storage Gateway generally performs better than when you upload smaller files. This is because Storage Gateway breaks up large files into multiple parts, and then uploads the parts in parallel streams to the S3 bucket.
To benchmark the upload speed from the gateway to AWS, run tests with a range of file sizes and thread counts. Then, review the CloudBytesUploaded metric to determine the upload speed.
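To turn CloudBytesUploaded into an upload speed, divide the bytes uploaded in each period by the period length. The following is a minimal sketch, not an official tool. It assumes that you already retrieved the metric with the Sum statistic and that each datapoint is a dictionary with "Timestamp" and "Sum" keys, as Amazon CloudWatch returns them. The function name, the 300-second period, and the sample values are hypothetical.

```python
from datetime import datetime, timezone


def upload_rate_mbps(datapoints, period_seconds):
    """Convert CloudBytesUploaded 'Sum' datapoints to megabits per second.

    Returns a list of (timestamp, mbps) tuples sorted by timestamp.
    """
    rates = []
    for point in sorted(datapoints, key=lambda p: p["Timestamp"]):
        bits = point["Sum"] * 8  # bytes uploaded in the period -> bits
        rates.append((point["Timestamp"], bits / period_seconds / 1_000_000))
    return rates


# Hypothetical datapoints for two 5-minute periods (values are made up).
sample = [
    {"Timestamp": datetime(2024, 1, 1, 0, 5, tzinfo=timezone.utc), "Sum": 3_750_000_000.0},
    {"Timestamp": datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc), "Sum": 1_875_000_000.0},
]

for timestamp, mbps in upload_rate_mbps(sample, period_seconds=300):
    print(f"{timestamp:%H:%M} {mbps:.1f} Mbps")
```

Compare the resulting rates across test runs with different file sizes to see how the upload speed changes.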
Review the gateway's cache storage
If you use a file gateway, then check your CachePercentDirty metric. Any data that's written to the gateway that isn't already written to Amazon S3 is considered dirty. A CachePercentDirty metric that's higher than 80% can indicate slow uploads from the gateway to Amazon S3.
If the CachePercentDirty metric is high, then check the CloudBytesUploaded metric to see if the upload speed to Amazon S3 is slow. If the upload speed is slow, then increase the internet bandwidth that's available to the gateway.
Also, check your gateway's IoWaitPercent metric on Amazon CloudWatch. If you see that your gateway's IoWaitPercent metric is higher than 10% during testing, then there might be an issue with your gateway. The gateway might have a disk that doesn't have enough I/O to handle the workload. Use the SampleCount statistic to review the WriteBytes metric and check your total write I/O to AWS.
If your gateway's cache disk doesn't have enough I/O to handle the workload, then change the cache disk to a faster disk type. For example, use an SSD or NVMe-backed SSD disk. You can also attach another cache disk to your gateway to help increase the aggregate I/O that's available to the gateway.
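The cache checks in this section can be sketched as a small helper. This is only an illustration under stated assumptions: the function and constant names are hypothetical, and the inputs are the peak CachePercentDirty and IoWaitPercent values that you observed in CloudWatch during your test window. The 80% and 10% thresholds come from the guidance in this section.

```python
DIRTY_CACHE_THRESHOLD = 80.0  # percent, CachePercentDirty guidance from this article
IO_WAIT_THRESHOLD = 10.0      # percent, IoWaitPercent guidance from this article


def diagnose_cache(cache_percent_dirty, io_wait_percent):
    """Return a list of findings based on the thresholds in this article."""
    findings = []
    if cache_percent_dirty > DIRTY_CACHE_THRESHOLD:
        findings.append(
            "CachePercentDirty is above 80%: uploads to Amazon S3 might be slow. "
            "Check CloudBytesUploaded."
        )
    if io_wait_percent > IO_WAIT_THRESHOLD:
        findings.append(
            "IoWaitPercent is above 10%: the cache disk might not have enough I/O. "
            "Consider a faster disk type or an additional cache disk."
        )
    return findings


# Hypothetical peak values observed during a test run.
for finding in diagnose_cache(cache_percent_dirty=92.5, io_wait_percent=4.0):
    print(finding)
```

An empty result means that neither threshold was exceeded, so the cache is less likely to be the bottleneck.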
Check the configuration of your gateway's host VM or Amazon EC2 instance
Confirm that the CPU and RAM of your gateway's host VM or EC2 instance support your gateway's throughput to AWS. For example, every EC2 instance type has a different baseline network throughput. If the burst throughput is exhausted, then the instance falls back to its baseline throughput. This limits the upload throughput to AWS.
If your gateway is hosted on an EC2 instance, then check the NetworkOut metric of the instance. If the NetworkOut metric sits at the baseline throughput during your testing, then change the instance to a larger instance type. A larger instance type achieves more network throughput.
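To compare NetworkOut with the baseline, convert the bytes sent in each period to gigabits per second. The following is a rough sketch under assumptions: the function names, the 5% tolerance, and the 0.75 Gbps baseline are hypothetical, and you must look up the actual baseline for your instance type. It assumes that you retrieved NetworkOut with the Sum statistic, so that each value is the number of bytes sent during the period.

```python
def network_out_gbps(sum_bytes, period_seconds):
    """Convert a NetworkOut 'Sum' value (bytes per period) to gigabits per second."""
    return sum_bytes * 8 / period_seconds / 1e9


def pinned_at_baseline(datapoint_sums, period_seconds, baseline_gbps, tolerance=0.05):
    """Return True if every sample is within `tolerance` of the baseline throughput."""
    rates = [network_out_gbps(s, period_seconds) for s in datapoint_sums]
    return bool(rates) and all(
        abs(rate - baseline_gbps) <= tolerance * baseline_gbps for rate in rates
    )


# Hypothetical 5-minute NetworkOut sums, in bytes, taken during an upload test.
samples = [28.0e9, 28.2e9, 27.9e9]
print(pinned_at_baseline(samples, period_seconds=300, baseline_gbps=0.75))
```

If the check returns True for the whole test window, then the instance's network baseline is a likely limit on upload throughput.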
Check the geographical distance between your gateway and the dataset
It's a best practice to deploy your gateway in the same network as your dataset, or in a network that's geographically close to your dataset. Avoid connections between the dataset and the gateway over a wide area network (WAN). An example of this is a gateway that you deploy on an EC2 instance with the file share mounted over AWS Direct Connect or a VPN. The latency from on-premises traffic to AWS over the WAN connection affects how fast the data gets to the gateway. This latency eventually affects the upload speed to the S3 bucket. To help reduce upload latency, deploy your gateway in the same AWS Region as the S3 bucket that you use as the file share.
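One way to see why latency matters is the general bound on a single TCP stream: throughput can't exceed the window size divided by the round-trip time. The following sketch illustrates that bound only. It isn't a model of Storage Gateway itself, and the 1 MiB window and the round-trip times are hypothetical values.

```python
def max_single_stream_mbps(window_bytes, rtt_ms):
    """Upper bound on single-stream TCP throughput: window size / round-trip time."""
    return window_bytes * 8 / (rtt_ms / 1000) / 1_000_000


WINDOW_BYTES = 1_048_576  # hypothetical 1 MiB TCP window

# Hypothetical round-trip times: same network, nearby site, long WAN link.
for rtt_ms in (1, 20, 80):
    bound = max_single_stream_mbps(WINDOW_BYTES, rtt_ms)
    print(f"RTT {rtt_ms} ms: at most {bound:.1f} Mbps per stream")
```

With the same window, a longer round trip lowers the ceiling for each stream, which is why a client that writes to the gateway across a WAN can become the bottleneck before the upload to Amazon S3 does.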