How do I troubleshoot partial or intermittent DNS failures related to Amazon VPC?
I want to troubleshoot partial or intermittent DNS failures that are related to my Amazon Virtual Private Cloud (Amazon VPC) environments.
Resolution
The name servers are incorrect at the registrar
To correctly configure your name servers at the registrar, take the following actions.
Find your registered name servers
Complete the following steps:
- Log in to your domain registrar account.
- Find the listed name servers for your domain on your control panel or account dashboard.
- Use the ICANN Lookup tool to find the registrar information for your domain. For more information, see Use ICANN Lookup on the Google Cloud website. Or, run the whois example.com command.
Note: Replace example.com with your domain name.
You can also use command-line tools to find your registered name server. For example, on a Windows or macOS system, use the nslookup or dig -t NS commands to view the name servers. Make sure to look up the name servers that are associated with the domain name.
Compare name servers
Log in to your web hosting account to find the name servers from your hosting provider. Then, compare your domain's name servers with your hosting provider's name servers to make sure that they match.
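The comparison can also be scripted once you have both lists. The following Python sketch is one possible way to do it, not a required step. The function names and the sample name servers are hypothetical. It treats names as equal regardless of letter case or a trailing dot.

```python
def normalize(ns):
    """Lowercase a name server host name and drop the trailing root dot."""
    return ns.strip().lower().rstrip(".")


def compare_name_servers(registrar, hosting):
    """Return the name servers that appear on only one side."""
    reg = {normalize(n) for n in registrar}
    host = {normalize(n) for n in hosting}
    return {
        "only_at_registrar": sorted(reg - host),
        "only_at_hosting_provider": sorted(host - reg),
    }


# Hypothetical values for illustration only.
registrar_ns = ["NS-1019.awsdns-63.net.", "ns-277.awsdns-34.com."]
hosting_ns = ["ns-1019.awsdns-63.net", "ns-1178.awsdns-19.org"]

diff = compare_name_servers(registrar_ns, hosting_ns)
print(diff)
```

If both lists in the result are empty, then the two sets of name servers match.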
Use online verification tools
In an online tool, enter your domain name to check whether the registered name servers are publicly resolving. For example, you can use DNS Checker on the DNS Checker website or DNS Lookup on the MxToolbox website.
Update incorrect settings
Resolve inconsistencies between your registrar's settings and your hosting provider's information on the registrar's platform.
The name servers are incorrect in the hosted zone
A partial DNS failure can occur when a resolver uses an incorrect name server to resolve the domain. This issue typically appears after you update the NS record or add a name server to it.
To view your domain's active name servers, complete the following steps:
- Open your terminal.
- Run the dig -t NS yourdomain.com command.
Note: Replace yourdomain.com with your domain name.
- Press Enter.
You can also use your cloud provider's web interface to view your hosted zone. For example, you can use Google Cloud DNS or Amazon Route 53.
To use a web interface, complete the following steps:
- Log in to your cloud provider's web interface.
- In the section for your domain's hosted zone, find the list of NS records.
- Check the records against the name servers that manage your domain.
For more information, see Update your domain's name servers on the Google Cloud website.
The resolvers in your configuration file are incorrect
If you don't correctly set the resolvers in your configuration file, then you can experience DNS issues. For example, an Amazon Elastic Compute Cloud (Amazon EC2) instance in an Amazon VPC uses the name servers that you defined in your configuration file. The EC2 instances use these name servers to resolve the domain.
To resolve this issue, update your configuration file with the correct resolvers. For Linux, the configuration file is /etc/resolv.conf. For Windows, it's %SystemRoot%\system32\dns.
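To check which resolvers a Linux host is configured to use, you can list the nameserver entries in the file. The following Python sketch shows one way to do that. The function name and the sample file contents are hypothetical. The sample reuses the resolver address 10.108.0.2 that appears in the dig output later in this article.

```python
def parse_nameservers(resolv_conf_text):
    """Return the IP addresses from the nameserver lines of resolv.conf text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        line = line.strip()
        # Lines that start with '#' or ';' are comments.
        if not line or line.startswith(("#", ";")):
            continue
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "nameserver":
            servers.append(parts[1])
    return servers


# Hypothetical file contents for illustration only.
sample = """\
# Generated by DHCP
search ec2.internal
nameserver 10.108.0.2
options timeout:2 attempts:5
"""

print(parse_nameservers(sample))
```

On a real host, you would pass in the contents of /etc/resolv.conf instead of the sample string.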
An Amazon-provided DNS server is throttling the DNS queries
Amazon-provided DNS servers reject traffic that exceeds the quota of 1024 packets per second for each elastic network interface. When queries are throttled, DNS resolution intermittently times out.
To resolve this issue, take one of the following actions:
- Activate DNS caching on the instance.
- Increase the DNS retry timer on the application.
To further troubleshoot, see How do I determine whether my DNS queries to the Amazon DNS server fail because of VPC DNS throttling?
CoreDNS is throttling the DNS queries for Amazon EKS
The Amazon VPC resolver supports up to 1024 packets per second for each network interface.
To remain within the quota, use inter-pod anti-affinity with CoreDNS in Amazon Elastic Kubernetes Service (Amazon EKS). Inter-pod anti-affinity routes traffic through different network interfaces to reduce DNS queries for each interface.
Schedule CoreDNS Pods on separate nodes. To use inter-pod anti-affinity rules to schedule CoreDNS Pods on separate instances, add the following options to the CoreDNS deployment:
podAntiAffinity:
  preferredDuringSchedulingIgnoredDuringExecution:
  - podAffinityTerm:
      labelSelector:
        matchExpressions:
        - key: k8s-app
          operator: In
          values:
          - kube-dns
      topologyKey: kubernetes.io/hostname
    weight: 100
Note: For more information about inter-pod anti-affinity, see Inter-pod affinity and anti-affinity on the Kubernetes website.
If queries still throttle, then increase CoreDNS replicas or use NodeLocal DNSCache. For more information, see Using NodeLocal DNSCache in Kubernetes clusters on the Kubernetes website.
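As a rough sizing aid for the quota above, the following Python sketch estimates how many network interfaces the upstream DNS traffic would need to be spread across. It assumes, as a simplification, that each CoreDNS Pod on a separate node sends its upstream queries through a different interface. The function name and the headroom factor of 0.8 are illustrative choices, not AWS guidance.

```python
import math

# Quota stated in this article: packets per second for each network interface.
VPC_RESOLVER_PPS_PER_INTERFACE = 1024


def min_interfaces_needed(total_upstream_pps, headroom=0.8):
    """Estimate how many interfaces keep each one under the quota.

    headroom is the fraction of the quota that you are willing to use.
    """
    usable_pps = VPC_RESOLVER_PPS_PER_INTERFACE * headroom
    return max(1, math.ceil(total_upstream_pps / usable_pps))


# Hypothetical cluster-wide upstream DNS rate of 3,000 packets per second.
print(min_interfaces_needed(3000))
```

If the estimate is higher than the number of nodes that currently run CoreDNS Pods, then more replicas on separate nodes or a node-local cache is worth considering.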
The domain URL resolves from the internet but not from the EC2 instance
DNS queries for your domain always resolve from the private hosted zone in the following scenarios:
- Your private hosted zone has the same name as your domain.
- You associated your VPC with the private hosted zone and configured your VPC DHCP options with AmazonProvidedDNS.
If the queried record for your domain isn't in the private hosted zone, then the DNS query fails and you receive an NXDOMAIN error. Also, the DNS resolver doesn't forward the DNS query to the public domain. Because the DNS record is in the public domain zone, it resolves from the internet. For more information, see Supported routing policies for records in a private hosted zone.
To resolve this issue, delete your private hosted zone or disassociate the VPC from your private hosted zone.
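The behavior described in this section can be pictured with a small model. The following Python sketch is a simplified illustration only, not how Route 53 Resolver is implemented. The function name, zone name, and records are hypothetical. It shows why a record that exists only in the public zone returns NXDOMAIN from inside the VPC.

```python
def resolve_in_vpc(name, private_zone, private_records, public_records):
    """Model a lookup from a VPC that is associated with a private hosted zone."""
    name = name.rstrip(".").lower()
    zone = private_zone.rstrip(".").lower()
    in_private_zone = name == zone or name.endswith("." + zone)
    if in_private_zone:
        # The query is answered only from the private hosted zone.
        # It isn't forwarded to the public zone when the record is missing.
        return private_records.get(name, "NXDOMAIN")
    return public_records.get(name, "NXDOMAIN")


# Hypothetical records for illustration only.
private_records = {"app.example.com": "10.0.0.10"}
public_records = {
    "www.example.com": "192.0.2.10",
    "www.example.org": "192.0.2.30",
}

print(resolve_in_vpc("www.example.com", "example.com",
                     private_records, public_records))
```

In this model, www.example.com resolves on the internet but returns NXDOMAIN in the VPC because the private hosted zone for example.com has no record for it.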
The DNS firewall rule in Route 53 is incorrect
If you incorrectly configured the DNS firewall rule in Route 53, then your domain doesn't resolve from a virtual private server (VPS). Instead, it resolves on the internet through a public resolver with 1.1.1.1 or 8.8.8.8 as the resolver IP address.
To correctly configure the DNS firewall rule, use Route 53 Resolver DNS Firewall to filter outbound DNS traffic for your domain.
The Route 53 resolver endpoints are incorrect
To resolve this issue, take the following actions:
- Use resolver query logging to view the DNS queries that your endpoints handle. You can send the logs to Amazon CloudWatch Logs or an Amazon Simple Storage Service (Amazon S3) bucket.
- Verify network connectivity from the outbound endpoint to the destination DNS server's IP address. To test the connection, use dig or nslookup from a temporary EC2 instance in the same subnet as the endpoint.
- Check that the subnet's route table contains a route to the on-premises DNS server's IP address. The route is typically through a VPN or AWS Direct Connect connection.
- Confirm that the outbound rules of the endpoint's security group allow TCP and UDP traffic on port 53 to the on-premises DNS server's IP address.
- Make sure that the network access control list (network ACL) that's associated with the endpoint subnets allows the correct outbound traffic. You must allow outbound TCP and UDP traffic to the on-premises DNS server on port 53. Also, the network ACL must allow inbound TCP and UDP traffic on the ephemeral port range (1024-65535) for the response.
- Verify that the outbound resolver rule has the correct IP address for the on-premises DNS server. An incorrect IP address causes queries to fail.
- Associate the resolver rule with the VPC that makes DNS queries.
- Make sure that your on-premises DNS server has a conditional forwarding rule. Configure the rule to send DNS queries for the domain to the inbound endpoint's IP addresses.
- Confirm that your on-premises DNS server sends recursive queries. The inbound endpoint doesn't support iterative queries.
- For private domain resolution, you must associate the private hosted zone with the VPC where the inbound endpoints are located.
- Activate the DNS resolution and DNS hostnames attributes for your VPC.
- Monitor the InboundQueryVolume and OutboundQueryVolume Route 53 Resolver metrics in CloudWatch to make sure that you don't exceed the queries per second (QPS) quotas. Each endpoint IP address can process up to 10,000 QPS over UDP.
Also, if you have multiple rules for the same domain or a conflict between a resolver rule and a private hosted zone, then the resolver rule takes precedence. Unexpected behavior can occur.
For more information, see How do I troubleshoot DNS resolution issues with Route 53 Resolver endpoints?
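To relate a query volume to the quota of 10,000 QPS for each endpoint IP address, you can convert a query count into an average rate. The following Python sketch assumes that you already have a query count for a single endpoint IP address over a known period. The function names, the sample IP addresses, and the 80 percent warning ratio are hypothetical.

```python
# Quota stated in this article: queries per second over UDP
# for each endpoint IP address.
ENDPOINT_IP_QPS_LIMIT = 10_000


def volume_to_qps(query_count, period_seconds):
    """Convert a query count over a period into an average QPS."""
    return query_count / period_seconds


def endpoints_near_limit(qps_by_ip, warn_ratio=0.8):
    """Return the endpoint IP addresses at or above warn_ratio of the quota."""
    threshold = ENDPOINT_IP_QPS_LIMIT * warn_ratio
    return sorted(ip for ip, qps in qps_by_ip.items() if qps >= threshold)


# Hypothetical counts over a 60-second period.
qps_by_ip = {
    "10.0.1.10": volume_to_qps(510_000, 60),  # 8,500 QPS
    "10.0.2.10": volume_to_qps(72_000, 60),   # 1,200 QPS
}
print(endpoints_near_limit(qps_by_ip))
```

An average over a period can hide short bursts, so treat a value near the threshold as a prompt to investigate further.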
DNS failures on Linux
For a Linux-based operating system (OS), take the following actions to troubleshoot DNS failures.
To perform a lookup against the client DNS server that's configured in the host's /etc/resolv.conf file, run the following dig command:
dig www.amazon.com
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 13150
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;www.amazon.com. IN A
;; ANSWER SECTION:
www.amazon.com. 41 IN A 54.239.17.6
;; Query time: 1 msec
;; SERVER: 10.108.0.2#53(10.108.0.2)
;; WHEN: Fri Oct 21 21:43:11 2016
;; MSG SIZE rcvd: 48
In the preceding example, the answer section shows that 54.239.17.6 is the IP address of the HTTP server for www.amazon.com.
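If you collect dig output from several hosts, it can help to extract the answer records programmatically. The following Python sketch parses the answer section of saved dig output. It's a minimal illustration that assumes the default dig output layout. The function name is hypothetical, and the sample text is taken from the preceding example.

```python
def parse_answer_section(dig_output):
    """Return (name, ttl, type, value) tuples from the ANSWER SECTION."""
    records = []
    in_answer = False
    for line in dig_output.splitlines():
        line = line.strip()
        if line.startswith(";; ANSWER SECTION:"):
            in_answer = True
            continue
        if in_answer:
            # A blank line or a comment line ends the section.
            if not line or line.startswith(";"):
                break
            parts = line.split(None, 4)
            if len(parts) == 5:
                name, ttl, _record_class, record_type, value = parts
                records.append((name, int(ttl), record_type, value))
    return records


sample = """\
;; QUESTION SECTION:
;www.amazon.com. IN A

;; ANSWER SECTION:
www.amazon.com. 41 IN A 54.239.17.6

;; Query time: 1 msec
"""

print(parse_answer_section(sample))
```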
To perform a recursive lookup of a DNS record, add the +trace option to the dig command:
dig +trace www.amazon.com
; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.37.rc1.49.amzn1 <<>> +trace www.amazon.com
;; global options: +cmd
. 518400 IN NS J.ROOT-SERVERS.NET.
. 518400 IN NS K.ROOT-SERVERS.NET.
. 518400 IN NS L.ROOT-SERVERS.NET.
...
;; Received 508 bytes from 10.108.0.2#53(10.108.0.2) in 31 ms
com. 172800 IN NS a.gtld-servers.net.
com. 172800 IN NS b.gtld-servers.net.
com. 172800 IN NS c.gtld-servers.net.
...
;; Received 492 bytes from 193.0.14.129#53(193.0.14.129) in 93 ms
amazon.com. 172800 IN NS pdns1.ultradns.net.
amazon.com. 172800 IN NS pdns6.ultradns.co.uk.
...
;; Received 289 bytes from 192.33.14.30#53(192.33.14.30) in 201 ms
www.amazon.com. 900 IN NS ns-1019.awsdns-63.net.
www.amazon.com. 900 IN NS ns-1568.awsdns-04.co.uk.
www.amazon.com. 900 IN NS ns-277.awsdns-34.com.
...
;; Received 170 bytes from 204.74.108.1#53(204.74.108.1) in 87 ms
www.amazon.com. 60 IN A 54.239.26.128
www.amazon.com. 1800 IN NS ns-1019.awsdns-63.net.
www.amazon.com. 1800 IN NS ns-1178.awsdns-19.org.
...
;; Received 186 bytes from 205.251.195.251#53(205.251.195.251) in 7 ms
To perform a query that returns only the name servers, run the following dig command:
dig -t NS www.amazon.com
; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.37.rc1.49.amzn1 <<>> -t NS www.amazon.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 48631
;; flags: qr rd ra; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;www.amazon.com. IN NS
;; ANSWER SECTION:
www.amazon.com. 490 IN NS ns-1019.awsdns-63.net.
www.amazon.com. 490 IN NS ns-1178.awsdns-19.org.
www.amazon.com. 490 IN NS ns-1568.awsdns-04.co.uk.
www.amazon.com. 490 IN NS ns-277.awsdns-34.com.
;; Query time: 0 msec
;; SERVER: 10.108.0.2#53(10.108.0.2)
;; WHEN: Fri Oct 21 21:48:20 2016
;; MSG SIZE rcvd: 170
In the preceding example, www.amazon.com has the following authoritative name servers:
- ns-1019.awsdns-63.net
- ns-1178.awsdns-19.org
- ns-1568.awsdns-04.co.uk
- ns-277.awsdns-34.com
To get an authoritative answer for the www.amazon.com host name, use the dig command to query one of its authoritative name servers directly. Then, check whether every authoritative name server for the domain answers correctly.
The following example output is for a query to www.amazon.com to one of its authoritative name servers ns-1019.awsdns-63.net. The server response shows that www.amazon.com is available on 54.239.25.192:
dig www.amazon.com @ns-1019.awsdns-63.net.
; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.37.rc1.49.amzn1 <<>> www.amazon.com @ns-1019.awsdns-63.net.
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 31712
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 4, ADDITIONAL: 0
;; WARNING: recursion requested but not available
;; QUESTION SECTION:
;www.amazon.com. IN A
;; ANSWER SECTION:
www.amazon.com. 60 IN A 54.239.25.192
;; AUTHORITY SECTION:
www.amazon.com. 1800 IN NS ns-1019.awsdns-63.net.
www.amazon.com. 1800 IN NS ns-1178.awsdns-19.org.
www.amazon.com. 1800 IN NS ns-1568.awsdns-04.co.uk.
...
;; Query time: 7 msec
;; SERVER: 205.251.195.251#53(205.251.195.251)
;; WHEN: Fri Oct 21 21:50:00 2016
;; MSG SIZE rcvd: 186
The following line shows that ns-1019.awsdns-63.net is an authoritative name server for www.amazon.com:
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 4, ADDITIONAL: 0
The aa flag shows that the name server ns-1019.awsdns-63.net gave an authoritative answer for the resource record www.amazon.com.
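When you repeat this check against every authoritative name server, you can test for the aa flag in saved output instead of reading each header by eye. The following Python sketch extracts the header flags from dig output. The function name is hypothetical, and the sample lines are copied from the examples in this article.

```python
import re


def parse_dig_flags(dig_output):
    """Return the set of header flags from dig output, or an empty set."""
    match = re.search(r"^;; flags:([^;]*);", dig_output, re.MULTILINE)
    return set(match.group(1).split()) if match else set()


authoritative_header = (
    ";; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 4, ADDITIONAL: 0"
)
recursive_header = (
    ";; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0"
)

for header in (authoritative_header, recursive_header):
    flags = parse_dig_flags(header)
    print("authoritative" if "aa" in flags else "non-authoritative")
```

The second header comes from the lookup against the VPC resolver earlier in this section. It lacks the aa flag because that answer came from a recursive resolver and not from an authoritative name server.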
DNS failures on Windows
For a Windows-based OS, take the following actions to troubleshoot DNS failures.
To return the IP address that's associated with a host name, use the nslookup utility:
C:\>nslookup www.amazon.com
Server: ip-10-20-0-2.ec2.internal
Address: 10.20.0.2
Non-authoritative answer:
Name: www.amazon.com
Address: 54.239.25.192
To determine the authoritative name servers for a host name, use the -type=NS flag with the nslookup utility:
C:\>nslookup -type=NS www.amazon.com
Server: ip-10-20-0-2.ec2.internal
Address: 10.20.0.2
Non-authoritative answer:
www.amazon.com nameserver = ns-277.awsdns-34.com
www.amazon.com nameserver = ns-1019.awsdns-63.net
www.amazon.com nameserver = ns-1178.awsdns-19.org
...
To check whether ns-277.awsdns-34.com for www.amazon.com correctly responds to a request for www.amazon.com, use the following syntax:
C:\>nslookup www.amazon.com ns-277.awsdns-34.com
Server: UnKnown
Address: 205.251.193.21
Name: www.amazon.com
Address: 54.239.25.200