
Why does my EKS cluster fail to initialize with "Nodegroup... failed to stabilize"?


I launched a cluster following the "Getting Started" guide (https://docs.aws.amazon.com/eks/latest/userguide/getting-started.html), as shown below. It fails each time on ManagedNodeGroup: CREATE_FAILED – "Nodegroup standard-workers failed to stabilize: Internal Failure".

The same thing happens when creating the cluster with the AWS SDK for Go: https://docs.aws.amazon.com/sdk-for-go/api/service/eks/#EKS.CreateCluster

How can I diagnose and overcome this?

 eksctl create cluster \
--name mycluster \
--region us-west-2 \
--nodegroup-name standard-workers \
--node-type t3.medium \
--nodes 1 \
--nodes-min 1 \
--nodes-max 1 \
--ssh-access \
--ssh-public-key joshuakey.pub  \
--managed
[ℹ]  eksctl version 0.11.1
[ℹ]  using region us-west-2
[ℹ]  setting availability zones to [us-west-2a us-west-2d us-west-2c]
[ℹ]  subnets for us-west-2a - public:192.168.0.0/19 private:192.168.96.0/19
[ℹ]  subnets for us-west-2d - public:192.168.32.0/19 private:192.168.128.0/19
[ℹ]  subnets for us-west-2c - public:192.168.64.0/19 private:192.168.160.0/19
[ℹ]  using SSH public key "joshuakey.pub" as "eksctl-mycluster-nodegroup-standard-workers-f5:cd:16:16:31:34:89:4f:10:ea:5b:74:85:e5:9b:2b"
[ℹ]  using Kubernetes version 1.14
[ℹ]  creating EKS cluster "mycluster" in "us-west-2" region with managed nodes
[ℹ]  will create 2 separate CloudFormation stacks for cluster itself and the initial managed nodegroup
[ℹ]  if you encounter any issues, check CloudFormation console or try 'eksctl utils describe-stacks --region=us-west-2 --cluster=mycluster'
[ℹ]  CloudWatch logging will not be enabled for cluster "mycluster" in "us-west-2"
[ℹ]  you can enable it with 'eksctl utils update-cluster-logging --region=us-west-2 --cluster=mycluster'
[ℹ]  Kubernetes API endpoint access will use default of {publicAccess=true, privateAccess=false} for cluster "mycluster" in "us-west-2"
[ℹ]  2 sequential tasks: { create cluster control plane "mycluster", create managed nodegroup "standard-workers" }
[ℹ]  building cluster stack "eksctl-mycluster-cluster"
[ℹ]  deploying stack "eksctl-mycluster-cluster"
[ℹ]  building managed nodegroup stack "eksctl-mycluster-nodegroup-standard-workers"
[ℹ]  deploying stack "eksctl-mycluster-nodegroup-standard-workers"
[✖]  unexpected status "ROLLBACK_IN_PROGRESS" while waiting for CloudFormation stack "eksctl-mycluster-nodegroup-standard-workers"
[ℹ]  fetching stack events in attempt to troubleshoot the root cause of the failure
[✖]  AWS::EKS::Nodegroup/ManagedNodeGroup: CREATE_FAILED – "Nodegroup standard-workers failed to stabilize: Internal Failure"
[ℹ]  1 error(s) occurred and cluster hasn't been created properly, you may wish to check CloudFormation console
[ℹ]  to cleanup resources, run 'eksctl delete cluster --region=us-west-2 --name=mycluster'
[✖]  waiting for CloudFormation stack "eksctl-mycluster-nodegroup-standard-workers": ResourceNotReady: failed waiting for successful resource state
[✖]  failed to create cluster "mycluster"
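
A minimal starting point for diagnosis, assuming the cluster and nodegroup names from the command above and that the nodegroup has not already been rolled back: EKS keeps a per-nodegroup health report that often carries more detail than the eksctl summary.

# Print any health issues EKS reports for the nodegroup
aws eks describe-nodegroup \
    --cluster-name mycluster \
    --nodegroup-name standard-workers \
    --region us-west-2 \
    --query "nodegroup.health.issues"

If the nodegroup resource was never created, the CloudFormation stack events for "eksctl-mycluster-nodegroup-standard-workers" are the next place to look.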

Edited by: JoshuaFox on Apr 28, 2020 4:51 AM

Edited by: JoshuaFox on Apr 28, 2020 7:02 AM

asked 5 years ago · 1.8K views
4 Answers

Same problem here:

[ℹ]  deploying stack "eksctl-Schulungen-nodegroup-Roboters"
[✖]  unexpected status "ROLLBACK_IN_PROGRESS" while waiting for CloudFormation stack "eksctl-Schulungen-nodegroup-Roboters"
[ℹ]  fetching stack events in attempt to troubleshoot the root cause of the failure
[✖]  AWS::EKS::Nodegroup/ManagedNodeGroup: CREATE_FAILED – "Nodegroup Roboters failed to stabilize: Internal Failure"
[ℹ]  1 error(s) occurred and cluster hasn't been created properly, you may wish to check CloudFormation console
[ℹ]  to cleanup resources, run 'eksctl delete cluster --region=us-east-2 --name=Schulungen'
[✖]  waiting for CloudFormation stack "eksctl-Schulungen-nodegroup-Roboters": ResourceNotReady: failed waiting for successful resource state
Error: failed to create cluster

Any ideas?

answered 5 years ago

Hint found at https://www.talkingquickly.co.uk/2020/04/nodegroup-failed-to-stabilize-internal-failure/

The problem described on that page was fixed in eksctl version 0.17.0. Today I downloaded 0.18.0 and the process worked. Problem solved!
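
For reference, a quick sanity check of the installed version; the exact upgrade step depends on how eksctl was installed (Homebrew shown here as one possibility):

# Print the installed eksctl version
eksctl version

# If eksctl was installed via Homebrew (weaveworks/tap), one possible upgrade path:
brew upgrade eksctl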

Edited by: klicki on May 4, 2020 4:57 AM

answered 5 years ago

Same issue with the EKS Quick Start template for an existing VPC:
https://raw.githubusercontent.com/aws-quickstart/quickstart-amazon-eks/master/templates/amazon-eks-master-existing-vpc.template.yaml

EKSNodegroup CREATE_FAILED Nodegroup EKSNodegroup-xxxxxxxxxx failed to stabilize: Internal Failure
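
The underlying reason is usually visible in the stack's failed events; a minimal sketch with the AWS CLI, using a placeholder stack name (substitute the name of the failed Quick Start stack):

# List CREATE_FAILED events and their reasons; "my-eks-quickstart-stack" is a placeholder
aws cloudformation describe-stack-events \
    --stack-name my-eks-quickstart-stack \
    --query "StackEvents[?ResourceStatus=='CREATE_FAILED'].[LogicalResourceId,ResourceStatusReason]" \
    --output table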

answered 5 years ago

Thank you, @klicki. Upgrading eksctl was all it took.

answered 5 years ago
