I see constant or intermittent packet loss and high latency issues with my AWS Virtual Private Network (AWS VPN) connection. I want to know what tests to run to confirm that the issue doesn't occur inside my Amazon Virtual Private Cloud (Amazon VPC).
Short description
The causes of packet loss can vary as AWS VPN internet traffic moves between the on-premises network and Amazon VPC. It's a best practice to isolate and confirm where the packet loss comes from.
Resolution
To determine if you reached network limits, check the source and destination hosts for resource utilization issues. For EC2 instances, you can find resource utilization issues in Amazon CloudWatch metrics such as CPUUtilization, NetworkIn and NetworkOut, or NetworkPacketsIn and NetworkPacketsOut.
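As one way to check those values, the following sketch pulls a single EC2 metric from Amazon CloudWatch with the AWS CLI. It assumes that the AWS CLI is installed and configured with permission to read CloudWatch metrics. The function name ec2_metric is a hypothetical helper, not part of the AWS CLI.

```shell
# Sketch: fetch 5-minute Average and Maximum datapoints for one EC2 metric.
# ec2_metric is a hypothetical helper name, not an AWS CLI command.
# Usage: ec2_metric <instance-id> <metric-name> <start-time> <end-time>
ec2_metric() {
  aws cloudwatch get-metric-statistics \
    --namespace AWS/EC2 \
    --metric-name "$2" \
    --dimensions "Name=InstanceId,Value=$1" \
    --start-time "$3" \
    --end-time "$4" \
    --period 300 \
    --statistics Average Maximum
}
```

For example, run ec2_metric with your instance ID, CPUUtilization, and ISO 8601 start and end times that cover the period when you saw packet loss. Then, repeat for NetworkIn, NetworkOut, NetworkPacketsIn, and NetworkPacketsOut. The on-premises host isn't covered by these metrics, so check it with your own monitoring tools.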
Use MTR to check for ICMP or TCP packet loss and latency
MTR provides a continuously updated output that lets you analyze network performance over time. It combines the functionality of traceroute and ping in a single network diagnostic tool. To check for ICMP or TCP packet loss and latency, install the MTR network tool on your Amazon Elastic Compute Cloud (Amazon EC2) instance that's in the VPC.
Amazon Linux:
sudo yum install mtr
Ubuntu:
sudo apt-get install mtr
Windows:
See the SourceForge website to install WinMTR.
Note: On Windows, WinMTR doesn't support TCP-based MTR.
Run the following tests against both the private and public IP addresses of your EC2 instance and on-premises host. Because the path between nodes on a TCP/IP network can change when the direction is reversed, get MTR results from both directions.
Before you run the tests, check the following configurations:
- Make sure that the security group and network access control list (network ACL) rules allow ICMP traffic from the source instance.
- Make sure that the test port is open on the destination instance. Confirm that the security group and network ACL rules allow traffic from the source on the protocol and port.
The TCP-based results determine if there's application-based packet loss or latency on the connection. MTR versions 0.85 and later have the TCP option.
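ICMP-based report for the private IP address of the EC2 instance or on-premises host:
mtr -n -c 200 <Private IP of EC2 instance or on-premises host> --report
TCP-based report for the private IP address of the EC2 instance or on-premises host:
mtr -n -T -c 200 -P 443 -m 60 <Private IP of EC2 instance or on-premises host> --report
ICMP-based report for the public IP address of the EC2 instance or on-premises host:
mtr -n -c 200 <Public IP of EC2 instance or on-premises host> --report
TCP-based report for the public IP address of the EC2 instance or on-premises host:
mtr -n -T -c 200 -P 443 -m 60 <Public IP of EC2 instance or on-premises host> --report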
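When you compare MTR results from both directions, the loss and average latency at the final hop are usually the most useful figures. Loss that appears only at an intermediate hop often reflects ICMP rate limiting on that device. The following sketch extracts the final hop from MTR report output. It assumes the plain-text report layout in which each hop line starts with a number followed by ".|--" and the columns are host, Loss%, Snt, Last, and Avg. The function name final_hop_summary is hypothetical.

```shell
# Sketch: print loss and average latency for the last hop of an MTR report.
# Assumes hop lines such as:
#   2.|-- 192.168.1.10   1.5%   200   20.1  21.3  19.8  40.2   2.1
# final_hop_summary is a hypothetical helper name.
final_hop_summary() {
  awk '/^ *[0-9]+\.\|--/ { host = $2; loss = $3; avg = $6 }
       END {
         sub(/%/, "", loss)
         if (host != "") printf "%s loss=%s%% avg=%sms\n", host, loss, avg
       }'
}
```

To use the helper, pipe the output of an MTR run that uses the --report option into final_hop_summary, once from the EC2 instance and once from the on-premises host.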
Use traceroute to determine latency or routing issues
The Linux traceroute utility identifies the path from a client node to the destination node. The utility records the time in milliseconds for each router to respond to the request. The traceroute utility also calculates the amount of time each hop takes before it reaches its destination.
A few timed-out requests are common, so check for packet loss to the destination or in the last hop of the route. Packet loss over several hops might indicate an issue.
Note: It's a best practice to run the traceroute command from the client to the server. Then, run the command from the server back to the client.
To install traceroute, run the following commands:
Amazon Linux:
sudo yum install traceroute
Ubuntu:
sudo apt-get install traceroute
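To test the private IP address of the EC2 instance and on-premises host, run the following commands:
Amazon Linux:
sudo traceroute <Private IP of EC2 instance or on-premises host>
sudo traceroute -T -p 80 <Private IP of EC2 instance or on-premises host>
Windows:
tracert <Private IP of EC2 instance or on-premises host>
tracetcp <Private IP of EC2 instance or on-premises host>
Note: The TCP-based commands (traceroute -T and tracetcp) perform the trace on port 80. Confirm that port 80, or the port that you're testing, is open in both directions.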
If you use Linux, then use the traceroute option to specify a TCP-based trace instead of ICMP. This is because most internet devices deprioritize ICMP-based trace requests.
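Because a few timed-out hops are common, it can help to separate silent intermediate hops from a timeout at the destination. The following sketch summarizes Linux traceroute output that it reads from standard input. It assumes the default output layout, where each hop line starts with the hop number and a hop with no replies shows "* * *". The function name summarize_trace is hypothetical.

```shell
# Sketch: count hops and timed-out hops in Linux traceroute output, and
# report whether the last hop replied.
# summarize_trace is a hypothetical helper name.
summarize_trace() {
  awk '/^ *[0-9]+ / {
         hops++
         timed_out = ($2 == "*" && $3 == "*" && $4 == "*")
         if (timed_out) silent++
       }
       END {
         if (hops == 0) { print "no hops parsed"; exit }
         printf "hops=%d silent=%d last_hop=%s\n", hops, silent, (timed_out ? "timeout" : "replied")
       }'
}
```

Pipe the traceroute output from each direction into summarize_trace. A result of last_hop=timeout points to loss at or near the destination, while silent intermediate hops with a responsive last hop are usually not a concern.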
Use hping3 to determine end-to-end TCP packet loss and latency problems
hping3 is a command-line TCP/IP packet assembler and analyzer that measures end-to-end packet loss and latency over a TCP connection. For more information, see hping3 on the die.net website.
MTR and traceroute capture per-hop latency. However, hping3 results show packet loss and end-to-end minimum, maximum, and average latency over TCP. To install hping3, run the following commands:
Amazon Linux:
sudo yum --enablerepo=epel install hping3
Ubuntu:
sudo apt-get install hping3
Run the following commands:
hping3 -S -c 50 -V <Public IP of EC2 instance or on-premises host>
hping3 -S -c 50 -V <Private IP of EC2 instance or on-premises host>
Note: By default, hping3 sends TCP headers to port 0 of the target host with a window size of 64 and with no TCP flags set. The -S option in the preceding commands sets the SYN flag.
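To compare runs against the public and private IP addresses, you can reduce the hping3 statistics to one line. The following sketch assumes that the statistics end with a line that contains "packet loss" and a line that starts with "round-trip min/avg/max =". The exact wording might differ between hping3 builds, so treat the patterns as assumptions. The function name hping_summary is hypothetical.

```shell
# Sketch: extract loss percentage and min/avg/max round-trip time from
# hping3 statistics read on standard input. Assumes lines such as:
#   50 packets transmitted, 49 packets received, 2% packet loss
#   round-trip min/avg/max = 19.8/21.3/40.2 ms
# hping_summary is a hypothetical helper name.
hping_summary() {
  awk '/packet loss/ {
         for (i = 1; i <= NF; i++) if ($i ~ /%$/) loss = $i
       }
       /^round-trip/ { split($4, rtt, "/") }
       END {
         printf "loss=%s min=%s avg=%s max=%s\n", loss, rtt[1], rtt[2], rtt[3]
       }'
}
```

When you pipe hping3 output into hping_summary, add 2>&1 to the hping3 command in case your build writes the statistics to standard error. To test an application port instead of the default port 0, add the -p option, for example -p 443.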
Packet capture samples with tcpdump or Wireshark
Perform simultaneous packet captures between your test EC2 instance in the VPC and your on-premises host when you duplicate the issue. This helps to determine if there are any application or network layer issues on the VPN connection. To perform packet captures, install tcpdump on your Linux instance or Wireshark on your Windows instance.
Amazon Linux:
sudo yum install tcpdump
Ubuntu:
sudo apt-get install tcpdump
Windows:
To install Wireshark, see the Wireshark website. Then, take a packet capture.
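On Linux, a capture that's limited to the peer host keeps the file small and easier to compare with the capture from the other side. The following sketch wraps one possible tcpdump invocation. The function name capture_peer, the default packet count of 10000, and the file name in the usage note are hypothetical choices, not recommendations from AWS.

```shell
# Sketch: capture full packets exchanged with one peer and write them to a file.
# -i any  : listen on all interfaces (Linux)
# -nn     : don't resolve host names or port names
# -s 0    : capture whole packets
# -c      : stop after this many packets (hypothetical default of 10000)
# capture_peer is a hypothetical helper name.
# Usage: capture_peer <peer-ip> <output-file> [packet-count]
capture_peer() {
  sudo tcpdump -i any -nn -s 0 -c "${3:-10000}" -w "$2" host "$1"
}
```

For example, run capture_peer with the on-premises host's IP address and an output file such as vpn-test.pcap on the EC2 instance. Start the matching capture on the on-premises host at the same time, and then reproduce the issue. You can open the resulting files in Wireshark or read them with tcpdump -nn -r.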
Turn off ECN
When explicit congestion notification (ECN) is turned on, it might cause packet loss or performance issues on connections to Windows instances. To improve performance, turn off ECN.
To determine if ECN is turned on, run the following command:
netsh interface tcp show global
If ECN capability is turned on, then run the following command to turn it off:
netsh interface tcp set global ecncapability=disabled
Related information
How do I troubleshoot network performance issues between Amazon EC2 Linux instances in a VPC and an on-premises host over the internet gateway?
How can I determine whether my DNS queries to the Amazon provided DNS server are failing due to VPC DNS throttling?