Resolution
Prerequisites:
- Make sure that your Amazon EFS file system has a mount target in each Availability Zone where your worker node subnets are located.
- Confirm that you use efs.csi.aws.com for the Amazon EFS storage class definition.
- Verify that you use PersistentVolumeClaim and PersistentVolume.
Note: If you use dynamic provisioning, then you don't need to create the PersistentVolume manually. The PersistentVolume is created automatically for the PersistentVolumeClaim.
- Make sure that you installed the Amazon EFS CSI driver add-on in the EKS cluster.
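You can spot-check the driver-related prerequisites from a workstation that has kubectl and the AWS CLI configured. The following is a minimal sketch, not an official procedure. The function name check_efs_prereqs is hypothetical.

```shell
# Hypothetical helper: spot-check the EFS CSI driver prerequisites.
# Usage: check_efs_prereqs CLUSTER_NAME
check_efs_prereqs() {
  local cluster="$1"

  # List each storage class with its provisioner.
  # EFS storage classes should show efs.csi.aws.com.
  kubectl get storageclass \
    -o custom-columns=NAME:.metadata.name,PROVISIONER:.provisioner

  # Confirm that the CSIDriver object for the driver exists.
  kubectl get csidriver efs.csi.aws.com

  # Show the status of the managed add-on. This call returns an error
  # if the driver was installed another way, for example with Helm.
  aws eks describe-addon --cluster-name "$cluster" \
    --addon-name aws-efs-csi-driver \
    --query 'addon.status' --output text
}
```

To run the checks, call the function with your cluster name, for example check_efs_prereqs my-cluster.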
Verify that you correctly configured the network from your EKS worker nodes to the Amazon EFS API
Make sure that you have access to the Amazon EFS API from the EKS worker nodes and the EFS CSI controller pods.
If you don't configure the network to reach the Amazon EFS API, then you might receive one of the following error messages:
- "failed to provision volume with StorageClass "xxxx": rpc error: code = DeadlineExceeded desc = context deadline exceeded"
- "Could not start amazon-efs-mount-watchdog, unrecognized init system "bash" Mount attempt x/3 failed due to timeout after 15 sec"
- "Unable to attach or mount volumes: timed out waiting for the condition"
If you use a private cluster with no outbound internet access, then you must include the com.amazonaws.region.elasticfilesystem virtual private cloud (VPC) endpoint in your VPC, where region is your AWS Region. Create an inbound rule for the VPC endpoint's security group that allows traffic to port 443 from your worker node and pod subnets. Confirm that the policy that's attached to the VPC endpoint has the required permissions.
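To check whether the endpoint exists, you can query the VPC endpoints with the AWS CLI. The following sketch assumes that the AWS CLI is configured. The function name find_efs_endpoint is hypothetical.

```shell
# Hypothetical helper: list the Amazon EFS interface endpoints in a VPC.
# Usage: find_efs_endpoint VPC_ID REGION
find_efs_endpoint() {
  local vpc_id="$1" region="$2"

  # Filter by VPC and by the EFS service name for the Region.
  # The output shows each endpoint's ID, state, and security groups.
  aws ec2 describe-vpc-endpoints --region "$region" \
    --filters "Name=vpc-id,Values=${vpc_id}" \
              "Name=service-name,Values=com.amazonaws.${region}.elasticfilesystem" \
    --query 'VpcEndpoints[].{Id:VpcEndpointId,State:State,SecurityGroups:Groups[].GroupId}' \
    --output json
}
```

An empty list in the output suggests that the VPC has no endpoint for the Amazon EFS API. If an endpoint is listed, then review the inbound rules of the security groups that the output shows.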
Verify that you correctly configured the Amazon EFS mount targets
Make sure that you created the Amazon EFS mount targets in each Availability Zone where the EKS nodes run. For example, if you distributed your worker nodes across us-east-1a and us-east-1b, then create mount targets in both Availability Zones for the EFS file system that you want to mount.
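To compare the Availability Zones of the nodes with the Availability Zones of the mount targets, you can use a sketch such as the following. It assumes that kubectl and the AWS CLI are configured and that the nodes carry the topology.kubernetes.io/zone label. The function name missing_mount_target_zones is hypothetical.

```shell
# Hypothetical helper: print zones that have nodes but no mount target.
# Usage: missing_mount_target_zones FILE_SYSTEM_ID
missing_mount_target_zones() {
  local fs_id="$1" node_zones mt_zones zone

  # Zones where nodes run, read from the topology label on each node.
  node_zones="$(kubectl get nodes \
    -o jsonpath='{range .items[*]}{.metadata.labels.topology\.kubernetes\.io/zone}{"\n"}{end}' \
    | sort -u)"

  # Zones that have a mount target for this file system.
  mt_zones="$(aws efs describe-mount-targets --file-system-id "$fs_id" \
    --query 'MountTargets[].AvailabilityZoneName' --output text \
    | tr '\t' '\n' | sort -u)"

  # Report every node zone that is missing from the mount target zones.
  for zone in $node_zones; do
    printf '%s\n' "$mt_zones" | grep -qxF -- "$zone" || echo "$zone"
  done
}
```

If the function prints nothing, then every zone that has nodes also has a mount target. Each printed zone needs a mount target.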
If you don't correctly configure the mount targets, then you might receive the following error message:
"Output: Failed to resolve "fs-xxxxxx.efs.us-east-1.amazonaws.com" - The file system mount target ip address cannot be found"
Verify that the security group associated with your Amazon EFS file system and worker nodes allows NFS traffic
If the security group doesn't allow traffic, then you might receive one of the following error messages:
- "Could not start amazon-efs-mount-watchdog, unrecognized init system "bash" Mount attempt x/3 failed due to timeout after 15 sec"
- "failed to provision volume with StorageClass "xxxx": rpc error: code = DeadlineExceeded desc = context deadline exceeded"
- "Unable to attach or mount volumes: timed out waiting for the condition"
The security group of the Amazon EFS file system must have an inbound rule that allows network file system (NFS) traffic from the Classless Inter-Domain Routing (CIDR) for your cluster's VPC. Allow port 2049 for inbound traffic.
The security group that's associated with the worker nodes where the pods fail to mount the EFS volume must have an outbound rule. The outbound rule must allow NFS traffic on port 2049 to the EFS file system.
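To review the inbound rules on the file system side, you can list the security groups of each mount target and print the rules that cover port 2049. The following is a sketch that assumes the AWS CLI is configured. The function name show_nfs_ingress is hypothetical.

```shell
# Hypothetical helper: show the inbound rules that cover port 2049
# for every security group attached to the file system's mount targets.
# Usage: show_nfs_ingress FILE_SYSTEM_ID
show_nfs_ingress() {
  local fs_id="$1" mt sg

  for mt in $(aws efs describe-mount-targets --file-system-id "$fs_id" \
      --query 'MountTargets[].MountTargetId' --output text); do
    for sg in $(aws efs describe-mount-target-security-groups \
        --mount-target-id "$mt" --query 'SecurityGroups[]' --output text); do
      echo "mount target ${mt}, security group ${sg}:"
      # Rules that allow all protocols carry no port fields,
      # so this filter doesn't list them.
      aws ec2 describe-security-groups --group-ids "$sg" \
        --query 'SecurityGroups[].IpPermissions[?FromPort<=`2049` && ToPort>=`2049`]' \
        --output json
    done
  done
}
```

Check that the listed rules include your cluster VPC's CIDR or the worker node security group as a source.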
Verify that you created the subdirectory path in your Amazon EFS file system
If you add subdirectory paths in persistent volumes, then the Amazon EFS CSI driver doesn't create the subdirectory path in the file system. You must already have the subdirectory in the file system for the mount operation to succeed. If the subdirectory isn't in the file system, then you might receive the following error message:
"Output: mount.nfs4: mounting fs-18xxxxxx.efs.us-east-1.amazonaws.com:/path-in-dir:/ failed, reason given by server: No such file or directory"
To verify whether the subdirectory exists in the EFS file system, mount the EFS file system on an Amazon Elastic Compute Cloud (Amazon EC2) instance, and then list its contents. If the subdirectory doesn't exist, then use the mkdir command to create it.
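The check can be sketched as follows. The sketch assumes that the EFS mount helper (amazon-efs-utils) is installed on the instance and that the instance can reach a mount target. The function name ensure_efs_subdir and the default mount point /mnt/efs are hypothetical.

```shell
# Hypothetical helper: mount the file system root, list its contents,
# and create the subdirectory if it's missing.
# Usage: ensure_efs_subdir FILE_SYSTEM_ID SUBDIR [MOUNT_POINT]
ensure_efs_subdir() {
  local fs_id="$1" subdir="$2" mount_point="${3:-/mnt/efs}"

  sudo mkdir -p "$mount_point"
  # Mount the root of the file system with the EFS mount helper.
  sudo mount -t efs -o tls "${fs_id}:/" "$mount_point" || return 1

  # List the existing top-level contents.
  sudo ls -la "$mount_point"

  # mkdir -p creates the path only if it doesn't exist yet.
  sudo mkdir -p "${mount_point}/${subdir#/}"

  sudo umount "$mount_point"
}
```

For the error message in this section, the call would be ensure_efs_subdir fs-18xxxxxx /path-in-dir, with fs-18xxxxxx replaced by your file system ID.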
Confirm that the cluster's VPC uses the Amazon DNS server
When you mount the EFS volume with the Amazon EFS CSI driver, you must use the Amazon DNS server for the VPC.
Note: Only the Amazon-provided DNS server can resolve the DNS name of an Amazon EFS file system.
To verify the DNS server, log in to the worker node, and then run the following command:
nslookup fs-4fxxxxxx.efs.region.amazonaws.com AMAZON_PROVIDED_DNS_IP
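Note: Replace fs-4fxxxxxx with your file system ID and region with your AWS Region. Replace AMAZON_PROVIDED_DNS_IP with the IP address of the Amazon DNS server. By default, this address is the base of the VPC network range plus two.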
If the cluster VPC uses a custom DNS server, then configure this DNS server to forward all *.amazonaws.com requests to the Amazon DNS server.
If the custom DNS server doesn't forward the requests, then you might receive the following error message:
"Output: Failed to resolve "fs-xxxxxx.efs.us-west-2.amazonaws.com" - The file system mount target ip address cannot be found"
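The Amazon DNS server listens on the base address of the VPC's primary IPv4 CIDR block plus two. The following sketch derives that address from the CIDR block so that you can pass it to nslookup. The function name amazon_dns_ip is hypothetical, and the sketch assumes that you pass the network address of the primary CIDR block.

```shell
# Hypothetical helper: derive the Amazon DNS server address
# (VPC base address plus two) from the primary IPv4 CIDR block.
# Usage: amazon_dns_ip 10.0.0.0/16
amazon_dns_ip() {
  local cidr="$1" old_ifs="$IFS"

  # Drop the prefix length, then split the address into four octets.
  IFS=.
  set -- ${cidr%/*}
  IFS="$old_ifs"

  # Add two to the last octet of the network address.
  echo "$1.$2.$3.$(($4 + 2))"
}

# Example use on a worker node:
#   nslookup fs-4fxxxxxx.efs.region.amazonaws.com "$(amazon_dns_ip 10.0.0.0/16)"
```

To confirm that DNS resolution is turned on for the VPC, you can also run aws ec2 describe-vpc-attribute with the --vpc-id option and --attribute enableDnsSupport.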
Verify that you have iam mount options in the PersistentVolume definition when you use a restrictive file system policy
If you don't add the iam mount option with a restrictive file system policy, then the pods fail with the following error message:
"mount.nfs4: access denied by server while mounting 127.0.0.1:/"
If you configured the Amazon EFS file system to restrict mount permissions to specific AWS Identity and Access Management (IAM) roles, then use the -o iam mount option. Include the spec.mountOptions property to allow the CSI driver to add the IAM mount option.
Example:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-pv1
spec:
  mountOptions:
    - iam
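The preceding example shows only the mountOptions property. A fuller static PersistentVolume manifest might look like the following sketch, which writes the manifest to a local file. The file system ID fs-xxxxxxxx, the storage class name efs-sc, and the 5Gi capacity are placeholders and not values from this article.

```shell
# Write a hypothetical static PersistentVolume manifest that includes
# the iam mount option. Replace the placeholder values before you apply it.
cat > efs-pv1.yaml <<'EOF'
apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-pv1
spec:
  capacity:
    storage: 5Gi              # required by Kubernetes; placeholder value
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc    # placeholder storage class name
  mountOptions:
    - iam
  csi:
    driver: efs.csi.aws.com
    volumeHandle: fs-xxxxxxxx # placeholder file system ID
EOF

# Apply the manifest with: kubectl apply -f efs-pv1.yaml
```

To review the file system policy that restricts the mount permissions, you can run aws efs describe-file-system-policy with the --file-system-id option.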
Verify that you annotated the Amazon EFS CSI driver controller service account with the correct IAM role that has the required permissions
To verify that the service account that the efs-csi-controller pods use has the correct annotation, run the following command:
kubectl describe sa efs-csi-controller-sa -n kube-system
Example output:
eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/AmazonEKS_EFS_CSI_DriverRole
To confirm that the service account has the correct AWS Identity and Access Management (IAM) role and permissions, verify the IAM OIDC provider for the cluster. Check that the IAM role that's associated with the efs-csi-controller-sa service account has the required permissions to perform EFS API calls. Then, verify that the IAM role's trust policy trusts the efs-csi-controller-sa service account.
Example IAM role trust policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::111122223333:oidc-provider/oidc.eks.region-code.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringLike": {
          "oidc.eks.region-code.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE:sub": "system:serviceaccount:kube-system:efs-csi-*",
          "oidc.eks.region-code.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE:aud": "sts.amazonaws.com"
        }
      }
    }
  ]
}
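To compare the cluster's OIDC issuer with the role's trust policy, you can collect the relevant values with the AWS CLI. The following is a sketch that assumes the AWS CLI is configured with permissions to read Amazon EKS and IAM resources. The function name check_efs_irsa is hypothetical.

```shell
# Hypothetical helper: print the values needed to verify the IAM role
# that the efs-csi-controller-sa service account uses.
# Usage: check_efs_irsa CLUSTER_NAME ROLE_NAME
check_efs_irsa() {
  local cluster="$1" role="$2"

  # OIDC issuer URL of the cluster. The Federated ARN in the trust
  # policy should end with the host and path of this URL.
  aws eks describe-cluster --name "$cluster" \
    --query 'cluster.identity.oidc.issuer' --output text

  # OIDC providers registered in the account. One of them should
  # match the issuer that the previous command returns.
  aws iam list-open-id-connect-providers \
    --query 'OpenIDConnectProviderList[].Arn' --output text

  # Trust policy of the role.
  aws iam get-role --role-name "$role" \
    --query 'Role.AssumeRolePolicyDocument' --output json

  # Managed policies attached to the role.
  aws iam list-attached-role-policies --role-name "$role" \
    --query 'AttachedPolicies[].PolicyArn' --output text
}
```

With the example annotation in this section, the call would be check_efs_irsa my-cluster AmazonEKS_EFS_CSI_DriverRole, where my-cluster is a placeholder for your cluster name.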
Verify that the EFS CSI driver pods are running
Run the following command to verify that these pods are active in your cluster:
kubectl get all -l app.kubernetes.io/name=aws-efs-csi-driver -n kube-system
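To check whether the controller deployment and the node daemon set finished rolling out, you can also query their rollout status. The following sketch uses the workload names that appear in the log commands in this article. The function name efs_csi_rollout_status and the 60-second timeout are hypothetical choices.

```shell
# Hypothetical helper: report whether the EFS CSI driver workloads
# are fully rolled out.
efs_csi_rollout_status() {
  # Controller pods (deployment).
  kubectl rollout status deployment/efs-csi-controller \
    -n kube-system --timeout=60s

  # Node pods (daemon set, one pod for each worker node).
  kubectl rollout status daemonset/efs-csi-node \
    -n kube-system --timeout=60s
}
```

If either command times out, then describe the affected pods to see why they aren't ready.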
Check the EFS mount operation from the EC2 worker node where the pod fails to mount the file system
Log in to the Amazon EKS worker node of the pod. Then, use the EFS mount helper to manually mount the EFS file system to the worker node. To test the mount operation, run the following command:
sudo mount -t efs -o tls file-system-dns-name efs-mount-point/
Note: Replace file-system-dns-name with the DNS name of your file system. Replace efs-mount-point with an existing directory on the worker node.
If the worker node can mount the file system, then review the efs-plugin logs from the CSI controller and CSI node pods.
Check the EFS CSI driver pod logs to determine the cause of the mount failures
If the volume fails to mount, then review the efs-plugin logs. To retrieve the efs-plugin container logs, run the following commands:
kubectl logs deployment/efs-csi-controller -n kube-system -c efs-plugin
kubectl logs daemonset/efs-csi-node -n kube-system -c efs-plugin
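When you pass a daemon set to kubectl logs, kubectl reads from only one of its pods, which might not be the pod on the affected node. The following sketch selects the driver pods on a specific node and filters their efs-plugin logs for common failure keywords. The function name efs_node_plugin_logs, the keyword list, and the 200-line tail are hypothetical choices.

```shell
# Hypothetical helper: show likely failure lines from the efs-plugin
# container of the driver pods that run on one worker node.
# Usage: efs_node_plugin_logs NODE_NAME
efs_node_plugin_logs() {
  local node="$1" pod

  # Select the driver pods scheduled on the given node.
  for pod in $(kubectl get pods -n kube-system \
      -l app.kubernetes.io/name=aws-efs-csi-driver \
      --field-selector "spec.nodeName=${node}" -o name); do
    echo "== ${pod} =="
    # Keep only the lines that mention common failure keywords.
    kubectl logs "$pod" -n kube-system -c efs-plugin --tail=200 \
      | grep -iE 'error|fail|denied|timed out' \
      || echo "(no matching lines)"
  done
}
```

To find the node name to pass to the function, run kubectl get pod with the -o wide option for the pod that fails to mount the volume.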