I want to find the top contributors of traffic through the NAT gateway in my Amazon Virtual Private Cloud (Amazon VPC).
Short description
To find the top traffic drivers of the NAT gateway in your Amazon VPC, use Amazon CloudWatch metrics to identify the time of traffic spikes.
Then, use one of the following methods to identify the Amazon VPC instances that cause traffic spikes:
- If your Amazon VPC flow logs publish data to CloudWatch, then use CloudWatch Logs.
- If your Amazon VPC flow logs publish data to Amazon Simple Storage Service (Amazon S3), then use Amazon Athena.
Resolution
Prerequisite: Check that you activated Amazon VPC flow logs for your Amazon VPC or NAT gateway elastic network interface. If you didn't turn on Amazon VPC flow logs, then create flow logs that publish to Amazon CloudWatch Logs or Amazon S3.
Use Amazon CloudWatch to view your NAT gateway’s metrics
Use the following Amazon CloudWatch metrics for your NAT gateway to identify the time frames that correspond with traffic spikes:
- BytesInFromSource shows the bytes that your NAT gateway receives from clients in your Amazon VPC (upload traffic from your instances).
- BytesInFromDestination shows the bytes that your NAT gateway receives from the destination (download traffic to your instances).
Then, query your Amazon VPC flow logs for the instances that experience high traffic during the time frames that you identified. You can use Amazon CloudWatch Logs Insights or Amazon Athena to query your Amazon VPC flow logs. For more information, see Query Amazon VPC flow logs.
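As an illustration of how you might turn metric datapoints into time frames, the following Python sketch flags periods where the byte count is well above the median. The function name find_spike_windows, the factor of 3, and the sample values are assumptions made for this example and aren't part of CloudWatch. The sketch expects datapoints that you already retrieved, for example the Sum of BytesInFromSource for each 5-minute period.

```python
from datetime import datetime, timedelta, timezone
from statistics import median


def find_spike_windows(datapoints, factor=3.0):
    """Return (first, last) timestamps of consecutive periods above the threshold.

    datapoints: list of (timestamp, byte_count) tuples, one per metric period.
    The threshold is factor x the median byte count (an arbitrary heuristic).
    """
    if not datapoints:
        return []
    points = sorted(datapoints)
    threshold = factor * median(value for _, value in points)
    windows = []
    window_start = None
    last_spike = None
    for timestamp, value in points:
        if value > threshold:
            if window_start is None:
                window_start = timestamp
            last_spike = timestamp
        elif window_start is not None:
            # The spike ended on the previous datapoint, so close the window.
            windows.append((window_start, last_spike))
            window_start = None
    if window_start is not None:
        windows.append((window_start, last_spike))
    return windows


# Hypothetical 5-minute sums of BytesInFromSource.
base = datetime(2024, 1, 1, tzinfo=timezone.utc)
sample_values = [100, 120, 110, 900, 950, 105, 115]
sample_points = [
    (base + timedelta(minutes=5 * i), value)
    for i, value in enumerate(sample_values)
]
for start, end in find_spike_windows(sample_points):
    print(f"Spike from {start.isoformat()} to {end.isoformat()}")
```

Each returned timestamp marks the start of a metric period, so extend the end of the window by one period when you set the time range for your flow log query.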
Use CloudWatch Logs Insights to identify the instances that cause traffic spikes
If your Amazon VPC flow logs publish data to CloudWatch, then use CloudWatch Logs Insights to query your Amazon VPC flow logs. In the Select log group(s) dropdown list, choose your NAT gateway’s log group. Then, choose Custom to set the time range that corresponds with the traffic spikes that you identified.
Use one of the following queries to display instances that cause traffic spikes based on your use case. You can also use a CloudFormation template to create a CloudWatch dashboard that incorporates the following queries. For a sample CloudFormation template, see aws-cloudformation-templates on the GitHub website.
Note: In the following queries, replace example-NAT-private-IP with your NAT gateway private IP address or IP addresses. For single IP queries, use the primary private IP address of your NAT gateway. Replace example-VPC-CIDR with the CIDR range of your Amazon VPC.
To identify instances that send the most traffic through your NAT gateway, run the following query:
filter (dstAddr in ["example-NAT-private-IP", "example-NAT-private-IP", "example-NAT-private-IP"] AND isIpv4InSubnet(srcAddr, "example-VPC-CIDR"))
| stats sum(bytes) as bytesTransferred by srcAddr, dstAddr
| sort bytesTransferred desc
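If your NAT gateway has several private IP addresses, then it can help to generate the query text instead of editing the placeholders by hand. The following Python sketch builds the preceding query from a list of IP addresses and a CIDR range. The function name build_top_senders_query and the sample addresses are hypothetical. The sketch only produces the query string and doesn't call CloudWatch.

```python
import ipaddress


def build_top_senders_query(nat_private_ips, vpc_cidr):
    """Build the Logs Insights query that ranks senders to the NAT gateway."""
    if not nat_private_ips:
        raise ValueError("Provide at least one NAT gateway private IP address")
    # Validate the inputs so that a typo fails here and not in the console.
    for ip in nat_private_ips:
        ipaddress.ip_address(ip)
    ipaddress.ip_network(vpc_cidr)

    ip_list = ", ".join(f'"{ip}"' for ip in nat_private_ips)
    return (
        f'filter (dstAddr in [{ip_list}] '
        f'AND isIpv4InSubnet(srcAddr, "{vpc_cidr}"))\n'
        "| stats sum(bytes) as bytesTransferred by srcAddr, dstAddr\n"
        "| sort bytesTransferred desc"
    )


# Hypothetical NAT gateway addresses and VPC CIDR range.
print(build_top_senders_query(["10.0.1.25", "10.0.2.31"], "10.0.0.0/16"))
```

You can paste the printed text into the Logs Insights query editor.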
To identify traffic that goes to and from your instances, run the following query:
filter (dstAddr in ["example-NAT-private-IP", "example-NAT-private-IP", "example-NAT-private-IP"] AND isIpv4InSubnet(srcAddr, "example-VPC-CIDR"))
OR (srcAddr in ["example-NAT-private-IP", "example-NAT-private-IP", "example-NAT-private-IP"] AND isIpv4InSubnet(dstAddr, "example-VPC-CIDR"))
| stats sum(bytes) as bytesTransferred by srcAddr, dstAddr
| sort bytesTransferred desc
To identify the internet destinations that your instances access most frequently, run the following queries:
- For uploads through a NAT gateway:
filter (srcAddr in ["example-NAT-private-IP", "example-NAT-private-IP", "example-NAT-private-IP"] AND not isIpv4InSubnet(dstAddr, "example-VPC-CIDR"))
| stats sum(bytes) as bytesTransferred by srcAddr, dstAddr
| sort bytesTransferred desc
| limit 10
- For downloads through a NAT gateway:
filter (dstAddr in ["example-NAT-private-IP", "example-NAT-private-IP", "example-NAT-private-IP"] AND not isIpv4InSubnet(srcAddr, "example-VPC-CIDR"))
| stats sum(bytes) as bytesTransferred by srcAddr, dstAddr
| sort bytesTransferred desc
| limit 10
To identify instances that send the most traffic through your NAT gateway to internet destinations, run the following queries:
- For uploads through a NAT gateway:
parse @message "* * * * * * * * * * * * * * * *" as version, account_id, interface_id, srcaddr, dstaddr, srcport, dstport, protocol, packets, bytes, start, end, action, log_status, pkt_srcaddr, pkt_dstaddr
| filter (dstaddr like 'example-NAT-private-IP' and isIpv4InSubnet(pkt_srcaddr, 'example-VPC-CIDR'))
| stats sum(bytes) as bytesTransferred by pkt_srcaddr, pkt_dstaddr
| sort bytesTransferred desc
| limit 10
- For downloads through a NAT gateway:
parse @message "* * * * * * * * * * * * * * * *" as version, account_id, interface_id, srcaddr, dstaddr, srcport, dstport, protocol, packets, bytes, start, end, action, log_status, pkt_srcaddr, pkt_dstaddr
| filter (srcaddr like 'example-NAT-private-IP' and not isIpv4InSubnet(pkt_srcaddr, 'example-VPC-CIDR'))
| stats sum(bytes) as bytesTransferred by pkt_srcaddr, pkt_dstaddr
| sort bytesTransferred desc
| limit 10
Note: The preceding queries that use pkt_srcaddr and pkt_dstaddr require Amazon VPC flow logs with a custom format. Make sure that your Amazon VPC flow logs include the pkt-srcaddr and pkt-dstaddr fields in the order that the parse statement lists. For more information, see Traffic through a NAT gateway.
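To see why the pkt-srcaddr and pkt-dstaddr fields matter, consider how a record on the NAT gateway network interface looks for instance traffic: dstaddr is the NAT gateway private IP address, and pkt-dstaddr is the internet destination. The following Python sketch parses records locally and totals bytes by pkt_srcaddr and pkt_dstaddr. It assumes a custom format with the default fields followed by pkt-srcaddr and pkt-dstaddr. The sample records, IP addresses, and function names are made up for illustration.

```python
import ipaddress
from collections import Counter

# Assumed field order: the default fields, then pkt-srcaddr and pkt-dstaddr.
FIELDS = [
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes", "start", "end",
    "action", "log_status", "pkt_srcaddr", "pkt_dstaddr",
]


def parse_record(line):
    """Split one space-separated flow log record into a dict keyed by field."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"Expected {len(FIELDS)} fields, got {len(values)}")
    return dict(zip(FIELDS, values))


def top_uploaders(lines, nat_private_ip, vpc_cidr, limit=10):
    """Total bytes sent from VPC instances through the NAT gateway."""
    vpc = ipaddress.ip_network(vpc_cidr)
    totals = Counter()
    for line in lines:
        record = parse_record(line)
        if record["log_status"] != "OK":
            continue  # NODATA and SKIPDATA records carry no traffic fields
        if record["dstaddr"] != nat_private_ip:
            continue  # keep only traffic that arrives at the NAT gateway
        if ipaddress.ip_address(record["pkt_srcaddr"]) not in vpc:
            continue  # keep only packets that originate inside the VPC
        key = (record["pkt_srcaddr"], record["pkt_dstaddr"])
        totals[key] += int(record["bytes"])
    return totals.most_common(limit)


# Made-up records. The last one is NAT-to-internet traffic and is skipped.
SAMPLE_LINES = [
    "2 123456789012 eni-0aaa 10.0.3.15 10.0.1.25 44321 443 6 10 5000 1700000000 1700000060 ACCEPT OK 10.0.3.15 203.0.113.10",
    "2 123456789012 eni-0aaa 10.0.3.15 10.0.1.25 44322 443 6 8 3000 1700000060 1700000120 ACCEPT OK 10.0.3.15 203.0.113.10",
    "2 123456789012 eni-0aaa 10.0.4.20 10.0.1.25 51000 443 6 4 1200 1700000000 1700000060 ACCEPT OK 10.0.4.20 198.51.100.7",
    "2 123456789012 eni-0aaa 10.0.1.25 203.0.113.10 44321 443 6 10 8000 1700000000 1700000060 ACCEPT OK 10.0.1.25 203.0.113.10",
]

for (source, destination), total in top_uploaders(SAMPLE_LINES, "10.0.1.25", "10.0.0.0/16"):
    print(source, destination, total)
```

Without the pkt fields, the records that leave the NAT gateway show only the NAT gateway address as the source, so you can't tie an internet destination back to a specific instance.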
Use Amazon Athena to identify the instances that cause traffic spikes
If your Amazon VPC flow logs publish data to Amazon S3, then use Amazon Athena to create a table for your flow logs. Make sure that you add the following filters to your queries to set the time range that corresponds with the traffic spikes that you identified:
start>= (example-timestamp-start)
end<= (example-timestamp-end)
Note: Replace example-timestamp-start with the start of the time frame that corresponds with your traffic spike. Replace example-timestamp-end with the end of the time frame that corresponds with your traffic spike.
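Flow log records store the start and end fields as Unix time in seconds, so you might need to convert the time frame from the CloudWatch console before you filter on it. The following Python sketch shows one way to do that conversion. The function name athena_time_filter is hypothetical. The sketch assumes that your Athena table stores start and end as integers, and that you wrap the column names in double quotation marks because end is a reserved word in Athena queries.

```python
from datetime import datetime, timezone


def athena_time_filter(window_start, window_end):
    """Return a WHERE clause fragment that limits results to the time frame."""
    if window_start.tzinfo is None or window_end.tzinfo is None:
        # Naive datetimes would be read as local time, which shifts the window.
        raise ValueError("Use timezone-aware datetimes")
    start_epoch = int(window_start.timestamp())
    end_epoch = int(window_end.timestamp())
    if end_epoch <= start_epoch:
        raise ValueError("The end of the time frame must be after the start")
    return f'"start" >= {start_epoch} AND "end" <= {end_epoch}'


# Hypothetical spike between 12:00 and 12:30 UTC on January 1, 2024.
print(athena_time_filter(
    datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
    datetime(2024, 1, 1, 12, 30, tzinfo=timezone.utc),
))
```

You can append the returned text to the WHERE clause of your query with AND.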
Then, query your table for instances that cause traffic spikes.
Note: In the following queries, replace example-NAT-private-IP with your NAT gateway private IP address. Replace example-VPC-CIDR with the CIDR range of your Amazon VPC. Replace example-database-name.example-table-name with your database and table names. Replace example-octets with the first two octets of the CIDR range of your Amazon VPC. For example, if your CIDR range is 10.24.34.0/23, then replace example-octets with 10.24.
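The octet replacement can be derived from the CIDR range. The following Python sketch returns a LIKE pattern for the first two octets, assuming that the pattern ends with a .% wildcard. The function name like_pattern_for_cidr is hypothetical.

```python
import ipaddress


def like_pattern_for_cidr(vpc_cidr):
    """Return a SQL LIKE pattern built from the first two octets of the CIDR."""
    network = ipaddress.ip_network(vpc_cidr, strict=False)
    if network.version != 4:
        raise ValueError("This sketch handles IPv4 CIDR ranges only")
    if network.prefixlen < 16:
        # A range wider than /16 spans more than one value of the second octet.
        raise ValueError("Two octets can't describe a range wider than /16")
    first, second = str(network.network_address).split(".")[:2]
    return f"{first}.{second}.%"


print(like_pattern_for_cidr("10.24.34.0/23"))  # prints 10.24.%
```

A two-octet pattern matches every address in the surrounding /16 range. If your Amazon VPC uses a smaller range, such as the /23 in the example, then the pattern can also match addresses outside your Amazon VPC.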
To identify instances that send the most traffic through your NAT gateway, run the following query:
SELECT srcaddr, dstaddr, sum(bytes) FROM example-database-name.example-table-name WHERE srcaddr like 'example-octets.%' AND dstaddr like 'example-NAT-private-IP' group by 1,2 order by 3 desc limit 10;
To identify traffic that goes to and from your instances, run the following query:
SELECT srcaddr, dstaddr, sum(bytes) FROM example-database-name.example-table-name WHERE (srcaddr like 'example-octets.%' AND dstaddr like 'example-NAT-private-IP') or (srcaddr like 'example-NAT-private-IP' AND dstaddr like 'example-octets.%') group by 1,2 order by 3 desc limit 10;
To identify the internet destinations that your instances access most frequently, run the following queries:
- For uploads through a NAT gateway:
SELECT srcaddr, dstaddr, sum(bytes) FROM example-database-name.example-table-name WHERE (srcaddr like 'example-NAT-private-IP' AND dstaddr not like 'example-octets.%') group by 1,2 order by 3 desc limit 10;
- For downloads through a NAT gateway:
SELECT srcaddr, dstaddr, sum(bytes) FROM example-database-name.example-table-name WHERE (srcaddr not like 'example-octets.%' AND dstaddr like 'example-NAT-private-IP') group by 1,2 order by 3 desc limit 10;
Related information
Sample queries
Boolean, comparison, numeric, datetime, and other functions
How do I use Athena to analyze VPC flow logs?
Using AWS Cost Explorer to analyze data transfer costs