I can't run kubectl commands. Also, I changed the endpoint access setting from public to private on my Amazon Elastic Kubernetes Service (Amazon EKS) cluster. Now, my cluster is stuck in the Failed state.
Resolution
Note: To set up access to the Kubernetes API server endpoint, see Modifying cluster endpoint access.
Troubleshoot kubectl command errors on a new or existing cluster
Confirm that your kubeconfig file connects to your cluster
Complete the following steps:
- Confirm that you used the correct kubeconfig file to connect with your cluster. For more information, see Organizing cluster access using kubeconfig files on the Kubernetes website.
- Run the following command to list the clusters and their server endpoints in your kubeconfig file:
kubectl config view -o jsonpath='{"Cluster name\tServer\n"}{range .clusters[*]}{.name}{"\t"}{.cluster.server}{"\n"}{end}'
Example output:
Cluster name              Server
"example-cluster-name"    https://"example-cluster-endpoint".eks.amazonaws.com
- Verify that the cluster names and endpoints in your kubeconfig file are correct. If a cluster’s name or endpoint is incorrect, then run the following command to update your cluster’s context in the kubeconfig file:
aws eks update-kubeconfig --name example-cluster-name --region example-region
Note: Replace example-cluster-name with the name of the cluster that you're updating and example-region with your AWS Region.
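Note: When aws eks update-kubeconfig creates an entry without an alias, it names the entry with the cluster's Amazon Resource Name (ARN). The following sketch assumes that naming and an ARN of the form arn:PARTITION:eks:REGION:ACCOUNT_ID:cluster/NAME. The function parse_eks_arn is a hypothetical helper, not part of the AWS CLI. It extracts the Region and cluster name so that you can pass them to the update command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: split an EKS cluster ARN into Region and cluster name.
# Assumes the form arn:PARTITION:eks:REGION:ACCOUNT_ID:cluster/NAME.
parse_eks_arn() {
  local arn="$1"
  local IFS=':'
  local -a parts
  read -r -a parts <<< "$arn"
  # Expect six colon-separated fields with "eks" as the service.
  if [ "${#parts[@]}" -ne 6 ] || [ "${parts[2]}" != "eks" ]; then
    return 1
  fi
  # The resource field must look like cluster/NAME.
  case "${parts[5]}" in
    cluster/?*) ;;
    *) return 1 ;;
  esac
  # Print "REGION NAME" on one line.
  printf '%s %s\n' "${parts[3]}" "${parts[5]#cluster/}"
}

# Example usage (requires kubectl and the AWS CLI, so it is left commented out):
#   read -r region name < <(parse_eks_arn "$(kubectl config current-context)")
#   aws eks update-kubeconfig --name "$name" --region "$region"
```

If the current context uses an alias instead of an ARN, then the function returns a non-zero status and you must supply the cluster name and Region yourself.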
- On your device, run the following telnet command on port 443 to check the API server endpoint connectivity:
telnet example-server-endpoint 443
Note: Replace example-server-endpoint with your API server endpoint.
In the following example output, the device connects to server endpoint D8DC9092A7985668FF67C3D1C789A9F5.gr7.us-east-2.eks.amazonaws.com on port 443:
$ echo exit | telnet D8DC9092A7985668FF67C3D1C789A9F5.gr7.us-east-2.eks.amazonaws.com 443
Trying 18.224.160.210...
Connected to D8DC9092A7985668FF67C3D1C789A9F5.gr7.us-east-2.eks.amazonaws.com.
Escape character is '^]'.
Connection closed by foreign host.
If your device can't connect to the API server endpoint through port 443, then complete the resolution steps in the following sections.
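Note: If telnet isn't installed on your device, then you can run a similar TCP check with bash. The following is a minimal sketch that assumes a bash build with /dev/tcp support and the coreutils timeout command. The function name check_tcp and the default timeout of 5 seconds are illustrative choices.

```shell
#!/usr/bin/env bash
# Sketch: test TCP reachability with bash's /dev/tcp redirection.
# check_tcp is a hypothetical helper name.
check_tcp() {
  local host="$1" port="${2:-443}" seconds="${3:-5}"
  # A child shell opens the TCP connection on file descriptor 3.
  # The descriptor closes when the child exits, and timeout ends
  # an attempt that hangs (for example, when a firewall drops packets).
  timeout "$seconds" bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}

# Example usage (replace example-server-endpoint with your API server endpoint):
#   if check_tcp example-server-endpoint 443; then
#     echo "Port 443 is reachable"
#   else
#     echo "Port 443 is not reachable"
#   fi
```

The function returns 0 only when the TCP handshake completes. It doesn't verify TLS or Kubernetes authentication.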
Check the DNS resolver
Run the following command from the same device where the kubectl commands failed:
nslookup example-server-endpoint
Note: Replace example-server-endpoint with your API server endpoint.
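Note: When only private access is turned on, the endpoint name resolves to private IP addresses from the cluster's VPC. The following sketch classifies a resolved IPv4 address as private or not so that you can tell which kind of endpoint your DNS resolver returns. It assumes that the VPC uses RFC 1918 address ranges, which might not be true for every VPC. The function is_private_ipv4 is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical helper: return 0 when an IPv4 address is in an RFC 1918 range
# (10.0.0.0/8, 172.16.0.0/12, or 192.168.0.0/16), 1 when it isn't,
# and 2 when the input isn't a dotted-quad IPv4 address.
is_private_ipv4() {
  local a b c d octet
  IFS='.' read -r a b c d <<< "$1"
  for octet in "$a" "$b" "$c" "$d"; do
    [[ "$octet" =~ ^[0-9]{1,3}$ ]] || return 2
  done
  # Force base 10 so that octets with leading zeros aren't read as octal.
  a=$((10#$a)); b=$((10#$b))
  if (( a == 10 )); then return 0; fi
  if (( a == 172 && b >= 16 && b <= 31 )); then return 0; fi
  if (( a == 192 && b == 168 )); then return 0; fi
  return 1
}

# Example usage (assumes that dig is installed; left commented out):
#   for ip in $(dig +short example-server-endpoint); do
#     if is_private_ipv4 "$ip"; then echo "$ip is private"; else echo "$ip is not private"; fi
#   done
```

If the addresses are private and your device is outside the cluster's VPC and its connected networks, then requests to the endpoint are expected to time out.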
Check whether you restricted public access to the API server endpoint
Take one of the following actions based on your requirements:
- If you restricted public API server endpoint access with CIDR blocks, then verify that your client machine's IP address falls within the allowed CIDR ranges.
- If you require access from outside your Amazon Virtual Private Cloud (Amazon VPC), then use a public endpoint with CIDR restrictions.
Make sure that your API server endpoint's access combination meets your access requirements.
Note: It's a best practice to set your API server endpoint to private and configure AWS VPN for external access. For more information, see Accessing a private only API server.
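To check whether a client IP address falls within an allowed CIDR range, you can compare the network bits of the two values. The following sketch shows one way to do this in bash for IPv4. The functions ip_to_int and ip_in_cidr are hypothetical helpers, and the script assumes well-formed input such as 203.0.113.25 and 203.0.113.0/24.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
  local a b c d
  IFS='.' read -r a b c d <<< "$1"
  echo $(( (10#$a << 24) | (10#$b << 16) | (10#$c << 8) | 10#$d ))
}

# Hypothetical helper: return 0 when the IPv4 address in $1 is inside
# the CIDR block in $2 (for example, 203.0.113.0/24).
ip_in_cidr() {
  local ip="$1" network="${2%/*}" prefix="${2#*/}"
  # Build a 32-bit mask with the first $prefix bits set.
  local mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
  (( ( $(ip_to_int "$ip") & mask ) == ( $(ip_to_int "$network") & mask ) ))
}

# Example usage (requires the AWS CLI and network access; left commented out).
# The checkip.amazonaws.com lookup is an assumption about how you find your public IP.
#   my_ip="$(curl -s https://checkip.amazonaws.com)"
#   for cidr in $(aws eks describe-cluster --name example-cluster-name \
#       --region example-region \
#       --query 'cluster.resourcesVpcConfig.publicAccessCidrs' --output text); do
#     ip_in_cidr "$my_ip" "$cidr" && echo "$my_ip is allowed by $cidr"
#   done
```

If no CIDR block matches your client's public IP address, then the public endpoint rejects the connection even when the kubeconfig file is correct.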
Troubleshoot kubectl command errors on a cluster after the endpoint access changes from public to private
Take the following actions:
- Confirm that you use a bastion host or connected networks to access the Amazon EKS API endpoint. Connected networks include peered VPCs, AWS Direct Connect, and VPNs.
Note: In private access mode, you can access the Amazon EKS API endpoint only from inside the cluster's VPC or connected networks.
- Check whether security groups or network access control lists (network ACLs) block requests to the Kubernetes API server.
Note: If you use a peered VPC, then confirm that the control plane security groups allow access from the peered VPC on port 443.
Troubleshoot a cluster that's stuck in the Failed state when you can't change the endpoint access setting from public to private
If there's a permissions issue with AWS Identity and Access Management (IAM), then your cluster enters the Failed state.
Confirm that the IAM role that you use is authorized to perform the Amazon Route 53 route53:AssociateVPCWithHostedZone action.
If the action isn't blocked, then check whether your AWS account has AWS Organizations service control policies (SCPs) that block the API calls. Verify that no implicit or explicit denies block your IAM entity's permissions at the organizational or account level. Deny statements block permissions even when the account administrator attaches the AdministratorAccess IAM policy, which allows all actions on all resources, to the IAM entity. AWS Organizations SCPs override the permissions for IAM entities.
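As a first check, you can ask the IAM policy simulator how it evaluates the action for your IAM entity. The following sketch wraps the aws iam simulate-principal-policy command and translates the EvalDecision value into a short message. The function names simulate_hosted_zone_access and describe_decision are hypothetical, and the simulator result is only an indication. It might not reflect every condition that applies to the real API call.

```shell
#!/usr/bin/env bash
# Sketch: simulate route53:AssociateVPCWithHostedZone for an IAM principal.
# Requires the AWS CLI and permission to call iam:SimulatePrincipalPolicy.
simulate_hosted_zone_access() {
  local principal_arn="$1"
  aws iam simulate-principal-policy \
    --policy-source-arn "$principal_arn" \
    --action-names route53:AssociateVPCWithHostedZone \
    --query 'EvaluationResults[0].EvalDecision' \
    --output text
}

# Translate an EvalDecision value (allowed, explicitDeny, implicitDeny)
# into a human-readable message.
describe_decision() {
  case "$1" in
    allowed)      echo "Allowed: the simulated policies permit the action." ;;
    explicitDeny) echo "Explicit deny: a Deny statement blocks the action." ;;
    implicitDeny) echo "Implicit deny: no policy allows the action." ;;
    *)            echo "Unknown decision: $1"; return 1 ;;
  esac
}

# Example usage (the role ARN is a placeholder; left commented out):
#   decision="$(simulate_hosted_zone_access arn:aws:iam::111122223333:role/example-role)"
#   describe_decision "$decision"
```

If the result is an explicit or implicit deny, then review the identity policies and any SCPs that apply to the account before you retry the endpoint access change.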