
How do I resolve ErrImagePull and ImagePullBackOff pod status errors in Amazon EKS?


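My Amazon Elastic Kubernetes Service (Amazon EKS) pod is in the ErrImagePull or ImagePullBackOff status.

Short description

If you run the kubectl get pods command and your pods are in the ImagePullBackOff status, then the pods aren't running correctly. The ImagePullBackOff status means that a container couldn't start because the image couldn't be retrieved or pulled. ErrImagePull is reported when a pull attempt fails, and ImagePullBackOff is reported while kubelet waits before it retries the pull. To resolve this issue, use the following solutions.

For more information, see Amazon EKS Connector Pods are in ImagePullBackOff status.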

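Resolution

Confirm the image information

Use the following steps to confirm the pod status error message and to verify that the image name, tag, and Secure Hash Algorithm (SHA) digest are correct:

  1. To get the pod status, run the following command:

    $ kubectl get pods -n default
    NAME                              READY   STATUS             RESTARTS   AGE
    nginx-7cdbb5f49f-2p6p2            0/1     ImagePullBackOff   0          86s
  2. To get details about the pod failure, run the following command: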

    $ kubectl describe pod nginx-7cdbb5f49f-2p6p2
    ...
    Events:
      Type     Reason     Age                   From               Message
      ----     ------     ----                  ----               -------
      Normal   Scheduled  4m23s                 default-scheduler  Successfully assigned default/nginx-7cdbb5f49f-2p6p2 to ip-192-168-149-143.us-east-2.compute.internal
      Normal   Pulling    2m44s (x4 over 4m9s)  kubelet            Pulling image "nginxx:latest"
      Warning  Failed     2m43s (x4 over 4m9s)  kubelet            Failed to pull image "nginxx:latest": rpc error: code = Unknown desc = Error response from daemon: pull access denied for nginxx, repository does not exist or may require 'docker login': denied: requested access to the resource is denied
      Warning  Failed     2m43s (x4 over 4m9s)  kubelet            Error: ErrImagePull
      Warning  Failed     2m32s (x6 over 4m8s)  kubelet            Error: ImagePullBackOff
      Normal   BackOff    2m17s (x7 over 4m8s)  kubelet            Back-off pulling image "nginxx:latest"
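  3. Confirm that your image tag and name exist and are correct. In the preceding example, the image name "nginxx" appears to be a misspelling of "nginx", so the registry reports that the repository doesn't exist or requires a login.

  4. If the image registry requires authentication, confirm that you're authorized to access it. To verify the image that's used in the pod, run the following command:

    $ kubectl get pods nginx-7cdbb5f49f-2p6p2 -o jsonpath="{.spec.containers[*].image}" | sort
    nginx:latest

To understand pod status values, see Pod phase on the Kubernetes website and How do I troubleshoot the pod status in Amazon EKS?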
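
When a namespace holds many pods, the status check from step 1 can be narrowed to the pods that failed to pull an image. The following is a minimal sketch, not part of the original procedure. It assumes the default kubectl get pods column layout (NAME, READY, STATUS, ...), and the function name filter_pull_failures is hypothetical.

```shell
# Print the names of pods whose STATUS column is ErrImagePull or
# ImagePullBackOff. Reads `kubectl get pods` output on stdin.
# Assumption: single-namespace output, where STATUS is the third column
# (with --all-namespaces the columns shift by one).
filter_pull_failures() {
  awk '$3 == "ErrImagePull" || $3 == "ImagePullBackOff" { print $1 }'
}

# Example usage against a live cluster:
#   kubectl get pods -n default | filter_pull_failures
```

Each name that the function prints can then be passed to kubectl describe pod to read the Events section.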

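Troubleshoot private registries

If you use Amazon EKS to retrieve images from a private registry, then additional configuration might be required. Use imagePullSecrets in the workload manifest to specify the credentials. These credentials authenticate with the private registry so that the pod can pull images from the specified private repository.

To view the contents of the secret in YAML, run the following command: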

kubectl get secret <secret_name> --output=yaml
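
The YAML output shows the credentials in base64-encoded form. To read them, the payload can be decoded. The following is a sketch that assumes the secret is of type kubernetes.io/dockerconfigjson, so that its data is stored under the .dockerconfigjson key. The function name show_registry_auth is hypothetical.

```shell
# Print the decoded registry configuration stored in a pull secret.
# $1 = secret name, for example regcred
# Assumption: the secret type is kubernetes.io/dockerconfigjson.
show_registry_auth() {
  kubectl get secret "$1" \
    --output="jsonpath={.data.\.dockerconfigjson}" | base64 --decode
}

# Example usage against a live cluster:
#   show_registry_auth regcred
```

The decoded JSON contains the registry server name and credentials, so avoid pasting it into shared logs or tickets.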

In the following example, a pod needs access to your Docker registry credentials in regcred:

apiVersion: v1
kind: Pod
metadata:
  name: private-reg
spec:
  containers:
  - name: private-reg-container
    image: your-private-image
  imagePullSecrets:
  - name: regcred

Replace your-private-image with the path to an image in your private registry, as in the following example:

your.private.registry.example.com/bob/bob-private:v1

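To pull the image from the private registry, Kubernetes needs credentials. The imagePullSecrets field in the configuration file specifies that Kubernetes must get the credentials from a secret named regcred.

For more information, see Pull an Image from a Private Registry on the Kubernetes website.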
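
If the regcred secret doesn't exist yet, it can be created with kubectl create secret docker-registry. The following is a minimal sketch under stated assumptions: the wrapper name create_regcred is hypothetical, and the server, user name, and password are placeholders that the caller supplies.

```shell
# Create a registry pull secret named regcred in the current namespace.
# $1 = registry server, $2 = user name, $3 = password (all placeholders)
# Note: a password passed as an argument can appear in shell history
# and in the process list.
create_regcred() {
  kubectl create secret docker-registry regcred \
    --docker-server="$1" \
    --docker-username="$2" \
    --docker-password="$3"
}

# Example usage against a live cluster:
#   create_regcred your.private.registry.example.com bob 'your-password'
```

The secret must be in the same namespace as the pod that references it in imagePullSecrets.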

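Resolve other registry issues

Image pull failures

When the "Failed to pull image..." error includes a timeout message, kubelet tried to connect to the registry endpoint but the connection timed out.

In the following example, the registry is unreachable because kubelet can't connect to the registry endpoint: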

$ kubectl describe pods nginx-9cc69448d-vgm4m
...
Events:
  Type     Reason     Age                From               Message
  ----     ------     ----               ----               -------
  Normal   Scheduled  16m                default-scheduler  Successfully assigned default/nginx-9cc69448d-vgm4m to ip-192-168-149-143.us-east-2.compute.internal
  Normal   Pulling    15m (x3 over 16m)  kubelet            Pulling image "nginx:stable"
  Warning  Failed     15m (x3 over 16m)  kubelet            Failed to pull image "nginx:stable": rpc error: code = Unknown desc = Error response from daemon: Get "https://registry-1.docker.io/v2/": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
  Warning  Failed     15m (x3 over 16m)  kubelet            Error: ErrImagePull
  Normal   BackOff    14m (x4 over 16m)  kubelet            Back-off pulling image "nginx:stable"
  Warning  Failed     14m (x4 over 16m)  kubelet            Error: ImagePullBackOff

To resolve this error, check that your subnets, security groups, and network ACLs allow communication with the registry endpoint.
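
One way to test the network path is to send a request to the registry endpoint from the affected worker node. The following is a sketch, not part of the original procedure. It assumes that curl is available on the node, and the function name probe_registry is hypothetical. Any HTTP response, including 401 Unauthorized, shows that the endpoint is reachable. A timeout points to the network configuration.

```shell
# Report whether a registry endpoint answers over HTTPS within 10 seconds.
# $1 = registry endpoint URL, for example https://registry-1.docker.io/v2/
probe_registry() {
  local code
  # curl exits non-zero on DNS failures, refused connections, and timeouts.
  code=$(curl -sS -o /dev/null --max-time 10 -w '%{http_code}' "$1") || {
    echo "unreachable"
    return 1
  }
  echo "reachable (HTTP $code)"
}

# Example usage on the worker node:
#   probe_registry https://registry-1.docker.io/v2/
```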

Registry rate limit exceeded

In the following example, the registry rate limit is exceeded:

$ kubectl describe pod nginx-6bf9f7cf5d-22q48
...
Events:
  Type     Reason                  Age                   From               Message
  ----     ------                  ----                  ----               -------
  Normal   Scheduled               3m54s                 default-scheduler  Successfully assigned default/nginx-6bf9f7cf5d-22q48 to ip-192-168-153-54.us-east-2.compute.internal
  Warning  FailedCreatePodSandBox  3m33s                 kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "82065dea585e8428eaf9df89936653b5ef12b53bef7f83baddb22edc59cd562a" network for pod "nginx-6bf9f7cf5d-22q48": networkPlugin cni failed to set up pod "nginx-6bf9f7cf5d-22q48_default" network: add cmd: failed to assign an IP address to container
  Warning  FailedCreatePodSandBox  2m53s                 kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "20f2e27ba6d813ffc754a12a1444aa20d552cc9d665f4fe5506b02a4fb53db36" network for pod "nginx-6bf9f7cf5d-22q48": networkPlugin cni failed to set up pod "nginx-6bf9f7cf5d-22q48_default" network: add cmd: failed to assign an IP address to container
  Warning  FailedCreatePodSandBox  2m35s                 kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "d9b7e98187e84fed907ff882279bf16223bf5ed0176b03dff3b860ca9a7d5e03" network for pod "nginx-6bf9f7cf5d-22q48": networkPlugin cni failed to set up pod "nginx-6bf9f7cf5d-22q48_default" network: add cmd: failed to assign an IP address to container
  Warning  FailedCreatePodSandBox  2m                    kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "c02c8b65d7d49c94aadd396cb57031d6df5e718ab629237cdea63d2185dbbfb0" network for pod "nginx-6bf9f7cf5d-22q48": networkPlugin cni failed to set up pod "nginx-6bf9f7cf5d-22q48_default" network: add cmd: failed to assign an IP address to container
  Normal   SandboxChanged          119s (x4 over 3m13s)  kubelet            Pod sandbox changed, it will be killed and re-created.
  Normal   Pulling                 56s (x3 over 99s)     kubelet            Pulling image "httpd:latest"
  Warning  Failed                  56s (x3 over 99s)     kubelet            Failed to pull image "httpd:latest": rpc error: code = Unknown desc = Error response from daemon: toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit
  Warning  Failed                  56s (x3 over 99s)     kubelet            Error: ErrImagePull
  Normal   BackOff                 43s (x4 over 98s)     kubelet            Back-off pulling image "httpd:latest"

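If you try to pull an image from a public Docker Hub repository after you reach the pull rate limit, then the pull is blocked. For more information, see Docker Hub rate limit on the Docker Hub website.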
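
The three failure messages shown in this article can be told apart by a few distinctive substrings. The following is a rough sketch that maps a kubelet "Failed to pull image" message to the matching section. The function name classify_pull_error and its category labels are hypothetical, and the patterns cover only the messages quoted in this article.

```shell
# Map a kubelet "Failed to pull image" message to a likely cause.
# $1 = the Message text from the Events section of kubectl describe pod
classify_pull_error() {
  case "$1" in
    *toomanyrequests*)
      echo "rate-limit" ;;      # registry pull rate limit exceeded
    *"Client.Timeout exceeded"*|*"request canceled while waiting for connection"*)
      echo "network" ;;         # registry endpoint unreachable
    *"pull access denied"*|*"repository does not exist"*)
      echo "name-or-auth" ;;    # wrong image name or tag, or missing credentials
    *)
      echo "unknown" ;;
  esac
}

# Example usage:
#   classify_pull_error 'Failed to pull image "httpd:latest": ... toomanyrequests: ...'
```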

AWS OFFICIAL Updated 9 months ago