Hi @abhinav,
Confirm that your control plane's security group and worker node security group are configured with the recommended settings for inbound and outbound traffic. Also confirm that your custom network ACL rules are configured to allow traffic to and from "0.0.0.0/0" on ports 80, 443, and 1025-65535.
Refer: https://aws.amazon.com/premiumsupport/knowledge-center/eks-worker-nodes-cluster/
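If it helps, here is a quick sketch of how you might review those rules from the CLI; the cluster name, security group IDs, and VPC ID are placeholders you would substitute with your own values:
aws eks describe-cluster --name <CLUSTER_NAME> \
  --query 'cluster.resourcesVpcConfig.{clusterSG:clusterSecurityGroupId,additionalSGs:securityGroupIds}'
aws ec2 describe-security-groups --group-ids <CLUSTER_SG_ID> <NODE_SG_ID> \
  --query 'SecurityGroups[].{id:GroupId,ingress:IpPermissions,egress:IpPermissionsEgress}'
aws ec2 describe-network-acls --filters Name=vpc-id,Values=<VPC_ID> \
  --query 'NetworkAcls[].Entries'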
If the answer is helpful, please click Accept Answer & UPVOTE; this can be beneficial to other community members.
I've tested this with EKS 1.22 and 1.23, and was able to reproduce it. Besides the node group health issues surfaced in the console and via the CLI command below, I saw no actual degradation of the nodes.
aws eks describe-nodegroup --cluster-name <CLUSTER_NAME> --nodegroup-name <NG_NAME>
...
"health": {
"issues": [
{
"code": "AccessDenied",
"message": "Your worker nodes do not have access to the cluster. Verify if the node instance role is present and correctly configured in the aws-auth ConfigMap.",
"resourceIds": [
"eksctl-<CLUSTER_NAME>-nodegroup-<NG_NAME>-<NODE_INSTANCE_ROLE>"
]
}
]
},
I think this error message is benign. While the console reports an unhealthy node group, the individual nodes show as healthy.
kubectl get nodes -oyaml | grep conditions -A 30
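If you want a narrower check than grepping the full YAML, a jsonpath query along these lines (just a sketch) prints each node's Ready condition:
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.conditions[?(@.type=="Ready")].status}{"\n"}{end}'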
Yeah, everything is actually working fine; just the EKS dashboard shows the node group as unhealthy. Because of that we were not able to update the cluster, and Terraform runs also failed with an error saying the cluster is unhealthy. I removed system:masters from the EKS worker node role and instead gave the Jenkins agent pods access via a service account bound to the cluster-admin role through a ClusterRoleBinding; things worked fine and these errors were gone. How were you able to reproduce this error?
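In case it is useful to anyone trying the same workaround, here is a rough sketch of that setup (the jenkins namespace and jenkins-agent name are just assumptions, and cluster-admin is very broad, so scope it down if you can):
kubectl create namespace jenkins
kubectl -n jenkins create serviceaccount jenkins-agent
kubectl create clusterrolebinding jenkins-agent-admin \
  --clusterrole=cluster-admin \
  --serviceaccount=jenkins:jenkins-agent
# then reference serviceAccountName: jenkins-agent in the Jenkins agent pod template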
Hi!
This might be a simple answer, but I think the problem could be the missing dash after the pipe (-).
Correct:
data:
  mapRoles: |-
Incorrect:
data:
  mapRoles: |
See? The missing dash. This is basically explained by the "Block Chomping Indicator":
Block Chomping Indicator: The chomping indicator controls what should happen with newlines at the end of the string. The default, clip, puts a single newline at the end of the string. To remove all newlines, strip them by putting a minus sign (-) after the style indicator. Both clip and strip ignore how many newlines are actually at the end of the block; to keep them all put a plus sign (+) after the style indicator. https://yaml-multiline.info/
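As a standalone illustration (not EKS-specific), the two styles parse to slightly different strings:
strip: |-   # parses to "value"   (trailing newline removed)
  value
clip: |     # parses to "value\n" (one trailing newline kept)
  value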
When this is applied to Kubernetes:
[~]$ k -n kube-system get cm aws-auth-test -oyaml
apiVersion: v1
data:
  mapRoles: |-
    - groups:
      - system:bootstrappers
      - system:nodes
      - system:masters
      rolearn: myrole
      username: system:node:value
kind: ConfigMap
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","data":{"mapRoles":"- groups:\n - system:bootstrappers\n - system:nodes\n - system:masters\n rolearn: myrole\n username: system:node:value"},"kind":"ConfigMap","metadata":{"annotations":{},"name":"aws-auth-test","namespace":"kube-system"}}
  creationTimestamp: "2023-02-18T05:16:35Z"
  name: aws-auth-test
  namespace: kube-system
  resourceVersion: "1947"
  uid: eb241efd-a31a-45e8-b0fc-9f486ef5fff1
[~]$ k -n kube-system get cm aws-auth-test-bad -oyaml
apiVersion: v1
data:
  mapRoles: |
    - groups:
      - system:bootstrappers
      - system:nodes
      - system:masters
      rolearn: myrole
      username: system:node:value
kind: ConfigMap
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","data":{"mapRoles":"- groups:\n - system:bootstrappers\n - system:nodes\n - system:masters\n rolearn: myrole\n username: system:node:value\n"},"kind":"ConfigMap","metadata":{"annotations":{},"name":"aws-auth-test-bad","namespace":"kube-system"}}
  creationTimestamp: "2023-02-18T05:19:12Z"
  name: aws-auth-test-bad
  namespace: kube-system
  resourceVersion: "1990"
  uid: 1beebe54-58d5-4515-808c-b8333b1aec0b
It seems valid, but now there is a line break at the end of the role, system:node:value\n, under the annotation kubectl.kubernetes.io/last-applied-configuration, and maybe that's what the console is retrieving. It's an educated guess, as I haven't tested this on an AWS cluster, but you might want to give it a try.
If you run the same test, you will notice a blank line in the YAML when it is retrieved:
# bad one with a break at the end
[~]$ k -n kube-system get cm aws-auth-test-bad -oyaml | yq e '.data.mapRoles'
- groups:
  - system:bootstrappers
  - system:nodes
  - system:masters
  rolearn: myrole
  username: system:node:value

#<----- this is the line break from the bad configmap
# good one, no break at the end.
[~]$ k -n kube-system get cm aws-auth-test -oyaml | yq e '.data.mapRoles'
- groups:
  - system:bootstrappers
  - system:nodes
  - system:masters
  rolearn: myrole
  username: system:node:value
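Another way to see the difference, assuming the same two test ConfigMaps as above, is to dump the raw value at the byte level; the bad one ends in \n while the good one does not:
[~]$ k -n kube-system get cm aws-auth-test-bad -o jsonpath='{.data.mapRoles}' | od -c | tail -2
[~]$ k -n kube-system get cm aws-auth-test -o jsonpath='{.data.mapRoles}' | od -c | tail -2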
Hope this helps!
It's all fine. Whenever I remove the system:masters line, everything works fine, so something is wrong with that line. I removed system:masters from the EKS worker node role and gave the Jenkins agent pods access via a service account bound to the cluster-admin role through a ClusterRoleBinding, and the error was gone.
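For reference, a minimal sketch of the node mapping without system:masters, following the entry AWS documents for worker nodes (the role ARN is a placeholder for your node instance role):
apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system
data:
  mapRoles: |-
    - groups:
      - system:bootstrappers
      - system:nodes
      rolearn: <NODE_INSTANCE_ROLE_ARN>
      username: system:node:{{EC2PrivateDNSName}}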