
CreateContainerError: failed to reserve container name


On EKS 1.31 we periodically see pods stuck in Pending with a "CreateContainerError: failed to reserve container name" error. The name is reserved for an actual container ID. For example: CreateContainer within the sandbox returns container ID "123", and then we get the error: failed to reserve container name x, name is reserved for id "123". We tried both AL2023 and Bottlerocket nodes, and both can hit the same error. The problem affects only pods with init containers (and it looks like this condition is only possible for the last init container). Do you have any idea what this could be, and which tools could help us investigate?

asked a year ago · 1.4K views
1 Answer
The issue you're experiencing on EKS 1.31, where pods are stuck in Pending with a "CreateContainerError: failed to reserve container name" error, is not a commonly reported problem, but it does appear to be related to the container runtime, particularly its handling of init containers.

To investigate and potentially resolve this issue, you can try the following approaches:

1. Check the container runtime logs: Look for relevant entries in the containerd logs (EKS 1.31 nodes run containerd) that might provide more insight into the name reservation conflict. You can use 'journalctl' on the node to access these logs.
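A minimal sketch of pulling those logs, assuming you can reach the node (for example via SSM Session Manager); the time window is a placeholder you should adjust to the failure:

```shell
# On an affected AL2023 node, filter containerd's journal for
# name-reservation errors around the failure window:
journalctl -u containerd --since "1 hour ago" --no-pager \
  | grep -i "reserve"

# On Bottlerocket, host logs are read from the admin container:
#   enter-admin-container
#   sheltie journalctl -u containerd --no-pager | grep -i "reserve"
```

The grep filter is intentionally loose so it catches both the "failed to reserve container name" and "name ... is reserved for" variants of the message.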

2. Examine node resources: Ensure that your nodes have sufficient resources (CPU, memory, storage) to handle the container creation process. Resource constraints can sometimes lead to unexpected behavior in container orchestration.

3. Review your init container configurations: Since the issue seems to affect pods with init containers, carefully review the specifications of your init containers. Ensure there are no conflicts in naming or resource allocation.
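For example, a quick way to list all container names in the pod spec and rule out duplicates (`<pod-name>` is a placeholder):

```shell
# Print init container and regular container names side by side:
kubectl get pod <pod-name> -o \
  jsonpath='init: {.spec.initContainers[*].name}{"\n"}main: {.spec.containers[*].name}{"\n"}'
```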

4. Update EKS and node components: If possible, try updating to the latest EKS patch version within the 1.31 release. Also, ensure that your node's kubelet, container runtime, and other components are up to date.
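A quick way to compare versions across the fleet; a difference between affected and healthy nodes is a useful signal:

```shell
# Kubelet and container runtime version per node:
kubectl get nodes -o custom-columns='NAME:.metadata.name,KUBELET:.status.nodeInfo.kubeletVersion,RUNTIME:.status.nodeInfo.containerRuntimeVersion'
```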

5. Analyze kubelet logs: Check the kubelet logs on the affected nodes for any error messages or warnings related to container creation or name reservation.
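On the node itself, a sketch of filtering the kubelet journal for the relevant messages (time window is a placeholder):

```shell
# Kubelet logs filtered for container creation and
# name-reservation messages:
journalctl -u kubelet --since "2 hours ago" --no-pager \
  | grep -Ei "CreateContainer|reserve container name"
```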

6. Monitor cluster events: Use 'kubectl get events' to watch for any cluster-wide events that might be related to this issue.

7. Check the container runtime version: Switching runtimes is not an option here, since dockershim was removed in Kubernetes 1.24 and EKS 1.31 nodes run containerd. Instead, compare the containerd version between affected and unaffected nodes, and check the containerd release notes for fixes related to container name reservation.

8. Reach out to AWS support: If the issue continues after trying these steps, it may be worth contacting AWS support. They might have more specific insights or be aware of any known issues with EKS 1.31 related to this problem.

Remember to test any changes in a non-production environment first. Careful monitoring and logging will be key to understanding and resolving this issue.
  

Sources
failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error setting cgroup config | AWS re:Post
EksAttemptContainerDetail - AWS Batch
Troubleshooting AWS Deep Learning Containers on EKS - AWS Deep Learning Containers

answered a year ago
