CloudFormation stack for EKS cluster cannot update AWS::EKS::Addon on adding tags


We just updated a CDK stack that contains an EKS Cluster construct with coredns, vpc-cni, and kube-proxy addons to CDK 2.88.0 and that deployed just fine. Subsequently we applied some tags to the entire stack.

foreach (var tag in stackTags)
{
	Amazon.CDK.Tags.Of(stack).Add(tag.Key, tag.Value);
}

On cdk deploy, however, some errors appeared:

Failed resources:
shared-eks-stack-dev | 8:01:23 pm | UPDATE_FAILED        | AWS::EKS::Addon                           | shared-eks-dev/vpcCni (sharedeksdevvpcCniEACA28E9) Resource handler returned message: "Addon moved to failed status during Update operation.
Code: ConfigurationConflict, Message: Conflicts found when trying to apply. Will not continue due to resolve conflicts mode. Conflicts:
DaemonSet.apps aws-node - .spec.template.spec.initContainers[name="aws-vpc-cni-init"].env[name="DISABLE_TCP_EARLY_DEMUX"].value
DaemonSet.apps aws-node - .spec.template.spec.containers[name="aws-node"].env[name="ENABLE_POD_ENI"].value, ResourceIds: []
" (RequestToken: 106e9084-2a07-ac69-de0b-efe74a249732, HandlerErrorCode: NotStabilized)

shared-eks-stack-dev | 8:03:11 pm | UPDATE_FAILED        | AWS::EKS::Addon                           | shared-eks-dev/vpcCni (sharedeksdevvpcCniEACA28E9) Resource handler returned message: "Addon moved to failed status during Update operation.
Code: ConfigurationConflict, Message: Conflicts found when trying to apply. Will not continue due to resolve conflicts mode. Conflicts:
DaemonSet.apps aws-node - .spec.template.spec.containers[name="aws-node"].env[name="ENABLE_POD_ENI"].value
DaemonSet.apps aws-node - .spec.template.spec.initContainers[name="aws-vpc-cni-init"].env[name="DISABLE_TCP_EARLY_DEMUX"].value, ResourceIds: []
" (RequestToken: acfb9b9d-d0a9-1c3c-9ddb-7bc04a899cc8, HandlerErrorCode: NotStabilized)

Now the stack is stuck in UPDATE_ROLLBACK_FAILED state while the nested stacks are stuck in UPDATE_ROLLBACK_COMPLETE_CLEANUP_IN_PROGRESS. In the nested stacks all resources appear to be in the proper UPDATE_COMPLETE state, and there is nothing to roll back.

How should these be resolved or properly rolled back to a working state?

The resources noted to be problematic in the Continue update rollback pop-up are

  • One of the KubectlProvider nested stacks
  • coredns
  • kube-proxy
  • autoscaler service account role
  • vpc-cni
1 Answer
Accepted Answer

Checking with my colleague, it turned out he had edited the vpc-cni addon out-of-band (to let pods work with security groups). The original CDK code instantiated the CfnAddon constructs without a ResolveConflicts mode, so it defaulted to NONE. NONE makes the addon "resist" changes coming from CDK/CloudFormation, hence the resource update error.

To break out of this failed state, the vpc-cni addon has to be edited manually on the Add-ons page of the EKS console, setting its Conflict resolution method to Overwrite so the stack can complete its rollback (UPDATE_ROLLBACK_COMPLETE).
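If you prefer the CLI to the console, a roughly equivalent sequence is sketched below; the cluster and stack names are taken from the log output above, so substitute your own:

```shell
# Names below come from the question's log output; replace with your own.
# Re-apply the addon with conflict resolution set to OVERWRITE:
aws eks update-addon \
    --cluster-name shared-eks-dev \
    --addon-name vpc-cni \
    --resolve-conflicts OVERWRITE

# Once the addon is ACTIVE again, resume the stuck rollback:
aws cloudformation continue-update-rollback \
    --stack-name shared-eks-stack-dev
```

continue-update-rollback also accepts --resources-to-skip if individual resources still refuse to roll back.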

The next stack update then has to include a change to the CfnAddon constructs with ResolveConflicts = PRESERVE.

var coreDns = new CfnAddon(this, "coreDns", new CfnAddonProps
{
	AddonName = "coredns",
	ClusterName = this.Cluster.ClusterName,
	ResolveConflicts = "PRESERVE"
});

var vpcCni = new CfnAddon(this, "vpcCni", new CfnAddonProps
{
	AddonName = "vpc-cni",
	ClusterName = this.Cluster.ClusterName,
	ResolveConflicts = "PRESERVE"
});

var kubeProxy = new CfnAddon(this, "kubeProxy", new CfnAddonProps
{
	AddonName = "kube-proxy",
	ClusterName = this.Cluster.ClusterName,
	ResolveConflicts = "PRESERVE"
});

This way future updates to the stack won't hit the same roadblock.
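To confirm an addon has settled after the redeploy, you can query its status from the CLI (again, the cluster name is the one from the log above):

```shell
# Shows the addon's status and any reported health issues.
aws eks describe-addon \
    --cluster-name shared-eks-dev \
    --addon-name vpc-cni \
    --query 'addon.[status,health.issues]'
```

A healthy addon reports ACTIVE with an empty issues list.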

icelava
answered 9 months ago
