AWS EKS instance not found when attaching EBS volume


A few days ago, attaching EBS volumes suddenly stopped working. My EKS cluster uses the ebs.csi.aws.com add-on with dynamic provisioning.

Here is my StorageClass config:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: ebs-sc
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer

And here is the volumeClaimTemplates section of my StatefulSet config:

  volumeClaimTemplates:
  - metadata:
      name: log
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi

After the StatefulSet is deployed, a PVC, a PV and a VolumeAttachment are created. However, the pod is stuck in the ContainerCreating state with this error:

AttachVolume.Attach failed for volume "pvc-xxx" : rpc error: code = NotFound desc = Instance "i-xxx" not found

I triple-checked: the volume is not attached to any other instance, and the instance exists.

One odd thing, though: when I describe the created PV, I see this:

Source:
    Type:              CSI (a Container Storage Interface (CSI) volume source)
    Driver:            ebs.csi.aws.com
    FSType:            ext4
    VolumeHandle:      vol-xxx
    ReadOnly:          false
    VolumeAttributes:      storage.kubernetes.io/csiProvisionerIdentity=xxx-8081-ebs.csi.aws.com

The (unmasked) VolumeHandle refers to a volume that does not even exist.

Where might the problem be? As I said earlier, this issue appeared from one day to the next, without any change to the config.

Kubernetes version: 1.24. EBS CSI driver add-on version: v1.11.5-eksbuild.2 (neither upgrading nor downgrading helped).

Thanks

1 Answer

When you use EBS for persistent volumes, you need to remember that an EBS volume lives in a single AZ, so only EC2 instances in the same AZ can attach it. If a pod is rescheduled to a node in another AZ, it may fail with an error saying that it cannot find or attach the persistent volume. Every node can carry a label with its AZ, so you can use a nodeSelector or node affinity to make the pod schedule only in a particular AZ.
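As a rough illustration of the affinity approach, the fragment below pins a StatefulSet's pods to one zone using the well-known node label topology.kubernetes.io/zone. The zone value eu-west-1a is only a placeholder and is not taken from the question; substitute the AZ where the volume actually lives. This is a sketch of the pod template only, not a complete manifest.

```yaml
# Fragment of a StatefulSet spec (other fields omitted).
# Assumption: "eu-west-1a" is a placeholder AZ, not from the original post.
spec:
  template:
    spec:
      affinity:
        nodeAffinity:
          # Hard requirement: schedule only onto nodes in the listed zone.
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: topology.kubernetes.io/zone
                operator: In
                values:
                - eu-west-1a
```

A plain nodeSelector on the same label would also work; node affinity is shown because it allows listing more than one zone under values.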

EXPERT
answered a year ago
EXPERT
Artem
reviewed a month ago
  • I believe the volume binding mode WaitForFirstConsumer should prevent AZ mismatch.
