NodeCreationFailure Unable to join node group


Hi all!

Disclaimer: I am a beginner when it comes to Kubernetes/EKS, and I try to follow the AWS documentation as closely as I can. If I'm doing something stupid, forgot something, or there is a better way to do it, please say so!

I need help with the following error: NodeCreationFailure Couldn't proceed with upgrade process as new nodes are not joining node group EKS-%NODENAME%

This is what I have done:

I have an AWS account where I run a few Kubernetes clusters in EKS with managed nodes. Those clusters are pretty basic and were set up with the AWS starter guide. The only thing I had to modify to suit my needs was the bootstrap command, to change max pods. I then created a launch template from this node because I did not want to go through all the steps to increase the max pod number. In the launch template I removed everything else or set it to not be included in the template. The only things the template includes are the security group ID, storage (EBS drive size), and the bootstrap edit with max pods set to 110.

But now, when I want to update my nodes because there is a new AMI version or a new K8s version, I get the error mentioned above. I also have a node group without a launch template, and I can update that one without a problem.

What I have done to troubleshoot this:

  1. Verified my VPC setup
  2. Verified all roles and policies
  3. Created a new launch template
  4. Did the mandatory Google search
  5. Created a brand-new setup (new VPC, new cluster, new roles)
  6. Created a new launch template

In the end I still have the same problem, but now I think the problem is my Launch template. Yet I have no idea why or what to do next.
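One way to get more detail than the one-line error is to read the health issues that EKS records on the node group itself. The following is a minimal sketch, assuming boto3 is installed and AWS credentials are configured. The helper names `summarize_nodegroup_issues` and `fetch_nodegroup` are made up for illustration; `describe_nodegroup` is the EKS API call it relies on.

```python
from typing import Any


def summarize_nodegroup_issues(nodegroup: dict) -> list:
    """Turn the 'health.issues' list of a describe_nodegroup result into readable lines."""
    issues = nodegroup.get("health", {}).get("issues", [])
    lines = []
    for issue in issues:
        resources = ", ".join(issue.get("resourceIds", [])) or "n/a"
        code = issue.get("code", "?")
        message = issue.get("message", "")
        lines.append(f"{code}: {message} (resources: {resources})")
    return lines


def fetch_nodegroup(cluster_name: str, nodegroup_name: str) -> Any:
    """Fetch the node group description from EKS (needs boto3 and credentials)."""
    import boto3  # imported lazily so the summarizer also works without boto3

    eks = boto3.client("eks")
    response = eks.describe_nodegroup(
        clusterName=cluster_name, nodegroupName=nodegroup_name
    )
    return response["nodegroup"]
```

With credentials in place, something like `print("\n".join(summarize_nodegroup_issues(fetch_nodegroup("ACC_EKS_POC_Cluster", "EKS-ACC-Test"))))` would list each recorded issue together with the affected resource IDs.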

Aiv
asked 20 days ago · 293 views
1 Answer

Hi there,

If I understand you correctly, you are having issues updating a managed node group that was created with launch templates.

When you created the launch template from the existing managed node, did you use that launch template to create a new node group? If so, you can update the node group with a different version of the same launch template. When you update a node group that was deployed with a launch template, select the launch template version that you want to move the node group to. If the node group is configured with a custom AMI, the version you select must also specify an AMI. Note: You can't directly upgrade a node group that was deployed without a launch template to a launch template version. Instead, you must deploy a new node group using the launch template [1].
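As a rough illustration of updating a node group to a specific launch template version through the API, here is a minimal sketch assuming boto3 and configured credentials. The helper names are hypothetical and the launch template ID and version in the usage note are placeholders; `update_nodegroup_version` is the EKS call being wrapped.

```python
def build_update_request(cluster_name, nodegroup_name, launch_template_id, launch_template_version):
    """Build the keyword arguments for eks.update_nodegroup_version."""
    return {
        "clusterName": cluster_name,
        "nodegroupName": nodegroup_name,
        # The API expects the version as a string; give either 'id' or 'name', not both.
        "launchTemplate": {
            "id": launch_template_id,
            "version": str(launch_template_version),
        },
    }


def start_update(request):
    """Start the update and return its ID (needs boto3 and credentials)."""
    import boto3  # imported lazily so the request builder works without boto3

    eks = boto3.client("eks")
    response = eks.update_nodegroup_version(**request)
    return response["update"]["id"]
```

For example, `start_update(build_update_request("ACC_EKS_POC_Cluster", "EKS-ACC-Test", "lt-0123456789abcdef0", 3))` would request a move to version 3 of that (placeholder) template. The returned update ID can then be passed to `describe_update` to follow progress and read any errors.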

For details, refer to the AWS documentation page "Customizing managed nodes with launch templates".

Lastly, you can run the AWSSupport-TroubleshootEKSWorkerNode runbook. It analyzes an Amazon Elastic Compute Cloud (Amazon EC2) worker node and an Amazon Elastic Kubernetes Service (Amazon EKS) cluster to help you identify and troubleshoot common causes that prevent worker nodes from joining a cluster.
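The runbook can be started from the Systems Manager console, or programmatically. Below is a minimal sketch of the latter, assuming boto3 and configured credentials. The parameter names `ClusterName` and `WorkerID` are what I believe the runbook expects, so treat them as an assumption and check the runbook's parameter list before relying on them. The helper names and the instance ID are placeholders.

```python
RUNBOOK_NAME = "AWSSupport-TroubleshootEKSWorkerNode"


def build_runbook_request(cluster_name, worker_instance_id):
    """Build the keyword arguments for ssm.start_automation_execution."""
    return {
        "DocumentName": RUNBOOK_NAME,
        # Automation parameters are passed as lists of strings.
        # Assumed parameter names: ClusterName and WorkerID.
        "Parameters": {
            "ClusterName": [cluster_name],
            "WorkerID": [worker_instance_id],
        },
    }


def start_runbook(request):
    """Start the automation and return its execution ID (needs boto3 and credentials)."""
    import boto3  # imported lazily so the request builder works without boto3

    ssm = boto3.client("ssm")
    response = ssm.start_automation_execution(**request)
    return response["AutomationExecutionId"]
```

Since the failing nodes are the newly launched ones, it is probably most useful to point the runbook at an instance created during the failed update rather than at a node that has already joined the cluster.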

Reference [1] - https://docs.aws.amazon.com/eks/latest/userguide/update-managed-node-group.html

AWS
Olawale
answered 20 days ago
  • Yes, I created a template from an existing managed node and then launched a new node group using that template. Now I'm trying to update a node group that was created with a launch template. I have run the AWSSupport-TroubleshootEKSWorkerNode runbook before and no errors showed up. I will run it again.

    How do I know if I'm using a custom AMI? In my launch template I have no AMI specified in the instance details. For example, my current cluster's control plane is on 1.28, and the new node group I had to create says I'm using AMI 1.28.8-20240506.

  • Q: How do I know if I'm using a custom AMI?

    A: You are using a custom AMI when you specify an AMI ID in the ImageId field of your launch template.

  • Unfortunately, still no luck. I ran the runbook and it passed all the checks on multiple nodes.

    My launch template contains only the following two things and nothing else: the security group ID and the bootstrap edit:

    /etc/eks/bootstrap.sh ACC_EKS_POC_Cluster --kubelet-extra-args '--node-labels=eks.amazonaws.com/capacityType=ON_DEMAND,eks.amazonaws.com/nodegroup=EKS-ACC-Test --max-pods=110' --b64-cluster-ca $B64_CLUSTER_CA --apiserver-endpoint $API_SERVER_URL --dns-cluster-ip $K8S_CLUSTER_DNS_IP --use-max-pods false

    By the way, I have this exact issue on 3 clusters, one of which is not even 2 weeks old.
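To double-check what a launch template version really contains, its data can be inspected directly. The sketch below reports whether `ImageId` is set (the custom AMI test from the answer above) and what the user data looks like. It assumes boto3 and configured credentials, and the helper names are hypothetical. My understanding, which should be verified against the launch template documentation, is that without a custom AMI EKS merges its own bootstrap section into the user data and expects the MIME multi-part format, so a template that calls bootstrap.sh itself is worth a closer look. The code only reports facts about the template and does not decide whether it is valid.

```python
import base64


def inspect_launch_template_data(data):
    """Report custom-AMI and user-data facts for a LaunchTemplateData dict."""
    # UserData is returned base64-encoded by the EC2 API.
    raw = base64.b64decode(data.get("UserData", ""))
    user_data = raw.decode("utf-8", errors="replace")
    return {
        # A custom AMI means the template sets ImageId.
        "custom_ami": bool(data.get("ImageId")),
        # MIME multi-part user data declares a multipart content type.
        "mime_multipart": "content-type: multipart/" in user_data.lower(),
        # True if the user data invokes the EKS bootstrap script itself.
        "calls_bootstrap": "/etc/eks/bootstrap.sh" in user_data,
    }


def fetch_launch_template_data(launch_template_id, version):
    """Fetch one launch template version from EC2 (needs boto3 and credentials)."""
    import boto3  # imported lazily so the inspector works without boto3

    ec2 = boto3.client("ec2")
    response = ec2.describe_launch_template_versions(
        LaunchTemplateId=launch_template_id, Versions=[str(version)]
    )
    return response["LaunchTemplateVersions"][0]["LaunchTemplateData"]
```

For the template described above, one would expect `custom_ami` to be false and `calls_bootstrap` to be true. Comparing that result with the requirements in the launch template documentation may help narrow down why new nodes fail to join.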
