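Why can't I run kubectl commands in Amazon EKS?
I can't successfully run kubectl commands, such as kubectl exec, kubectl logs, kubectl attach, or kubectl port-forward, in Amazon Elastic Kubernetes Service (Amazon EKS).
Resolution
Typically, kubectl commands fail in an Amazon EKS cluster because the API server can't communicate with the kubelet that runs on the worker nodes. The commands that fail this way are the ones that the API server relays to the kubelet, such as kubectl exec, kubectl logs, kubectl attach, and kubectl port-forward.
To troubleshoot this issue, verify the following:
- Pods are running in a secondary Classless Inter-Domain Routing (CIDR) range.
- The security groups for the control plane and the nodes follow best practices for inbound and outbound rules.
- The aws-auth ConfigMap has the correct AWS Identity and Access Management (IAM) role with the Kubernetes user name that's associated with your node.
- The requirement to submit a new certificate is met.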
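Pods are running in a secondary Classless Inter-Domain Routing (CIDR) range
Amazon EKS can't immediately communicate with nodes that are launched in subnets from CIDR blocks that were added to the virtual private cloud (VPC) after the cluster was created. When you add CIDR blocks to the VPC of an existing cluster, it can take up to five hours for Amazon EKS to recognize the updated range. For more information, see Amazon EKS VPC and subnet requirements and considerations.
If your pods are running in a secondary CIDR range, do the following:
- Wait up to five hours for the commands to start working.
- Make sure that each subnet has at least five free IP addresses so that the automation can complete successfully.
Use the following example command to view the available IP addresses for all subnets in a VPC: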
    [ec2-user@ip-172-31-51-214 ~]$ aws ec2 describe-subnets --filters "Name=vpc-id,Values=vpc-078af71a874f2f068" | jq '.Subnets[] | .SubnetId + "=" + "\(.AvailableIpAddressCount)"'
    "subnet-0d89886ca3fb30074=8186"
    "subnet-0ee46aa228bdc9a74=8187"
    "subnet-0a0186a277b8b6a51=8186"
    "subnet-0d1fb1de0732b5766=8187"
    "subnet-077eff87a4e25316d=8187"
    "subnet-0f01c02b04708f638=8186"
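The five-address check can also be scripted. The following is a minimal sketch, not an official AWS tool: low_ip_subnets is a hypothetical helper name, and it assumes input lines of the form "<subnet-id> <available-count>", which the AWS CLI can produce with --query and --output text.

```shell
# low_ip_subnets (hypothetical helper): read "<subnet-id> <available-ip-count>"
# lines on stdin and print the subnets that have fewer than MIN free addresses.
# MIN defaults to 5, the minimum mentioned in this section.
low_ip_subnets() {
  local min="${1:-5}"
  awk -v min="$min" 'NF >= 2 && ($2 + 0) < (min + 0) { print $1, $2 }'
}

# Example against a real VPC (requires AWS CLI credentials; the VPC ID is the
# one used in the example command):
#   aws ec2 describe-subnets \
#     --filters "Name=vpc-id,Values=vpc-078af71a874f2f068" \
#     --query 'Subnets[].[SubnetId,AvailableIpAddressCount]' --output text \
#     | low_ip_subnets 5
```

If the function prints nothing, every listed subnet has at least the requested number of free addresses.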
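The security groups for the control plane and the nodes have the minimum required inbound and outbound rules
The API server must have at least the minimum required inbound and outbound rules to make API calls to the kubelet that runs on the worker nodes. To verify that the security groups for the control plane and the nodes have the minimum required inbound and outbound rules, see Amazon EKS security group requirements and considerations.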
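One way to spot-check the node security group is to confirm that some inbound rule covers the kubelet port. The sketch below assumes that the kubelet API listens on TCP 10250 (the Kubernetes default, which this article doesn't state) and that rules arrive as "<protocol> <from-port> <to-port>" lines. allows_tcp_port is a hypothetical helper. It checks port coverage only, not the rule's source, so treat a match as a hint rather than proof.

```shell
# allows_tcp_port (hypothetical helper): read "<protocol> <from> <to>" lines on
# stdin and succeed if any rule covers TCP port $1. Protocol "-1" means all
# traffic; such rules carry no port range, so the port columns are ignored.
allows_tcp_port() {
  awk -v port="$1" '
    $1 == "-1" { found = 1 }
    $1 == "tcp" && ($2 + 0) <= (port + 0) && (port + 0) <= ($3 + 0) { found = 1 }
    END { exit(found ? 0 : 1) }
  '
}

# Example (the security group ID is a placeholder):
#   aws ec2 describe-security-groups --group-ids <node-security-group-id> \
#     --query 'SecurityGroups[].IpPermissions[].[IpProtocol,FromPort,ToPort]' \
#     --output text | allows_tcp_port 10250 && echo "an inbound rule covers 10250"
```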
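The aws-auth ConfigMap has the correct IAM role with the Kubernetes user name that's associated with your node
You must apply the correct IAM role to the aws-auth ConfigMap. Make sure that the IAM role has the Kubernetes user name that's associated with your node. To apply the aws-auth ConfigMap to your cluster, see Add IAM users or roles to your Amazon EKS cluster.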
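To confirm the mapping, you can print the ConfigMap and look for the node role entry. This sketch assumes the standard node mapping, in which the node instance role maps to the user name system:node:{{EC2PrivateDNSName}}. has_node_mapping is a hypothetical helper, and the role ARN in the usage line is a placeholder.

```shell
# has_node_mapping (hypothetical helper): read the aws-auth ConfigMap YAML on
# stdin and succeed only if it mentions the given role ARN and the node
# user name template. This is a plain text search, not a YAML parser, so it
# does not prove that both values belong to the same mapRoles entry.
has_node_mapping() {
  local role_arn="$1" yaml
  yaml="$(cat)"
  printf '%s\n' "$yaml" | grep -qF -- "rolearn: ${role_arn}" &&
    printf '%s\n' "$yaml" | grep -qF -- 'username: system:node:{{EC2PrivateDNSName}}'
}

# Example:
#   kubectl -n kube-system get configmap aws-auth -o yaml \
#     | has_node_mapping "arn:aws:iam::111122223333:role/my-node-role"
```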
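The requirement to submit a new certificate is met
Amazon EKS clusters require the node's kubelet to submit and rotate the serving certificate for itself. A certificate error occurs when a serving certificate isn't available.
1. Run the following commands to validate the kubelet server certificate: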
    cd /var/lib/kubelet/pki/
    # use openssl command to validate kubelet server cert
    sudo openssl x509 -text -noout -in kubelet-server-current.pem
The output looks similar to the following:
    Certificate:
        Data:
            Version: 3 (0x2)
            Serial Number:
                1e:f1:84:62:a3:39:32:c7:30:04:b5:cf:b0:91:6e:c7:bd:5d:69:fb
            Signature Algorithm: sha256WithRSAEncryption
            Issuer: CN=kubernetes
            Validity
                Not Before: Oct 11 19:03:00 2021 GMT
                Not After : Oct 11 19:03:00 2022 GMT
            Subject: O=system:nodes, CN=system:node:ip-192-168-65-123.us-east-2.compute.internal
            Subject Public Key Info:
                Public Key Algorithm: id-ecPublicKey
                    Public-Key: (256 bit)
                    pub:
                        04:7f:44:c6:95:7e:0f:1e:f8:f8:bf:2e:f8:a9:40:
                        6a:4f:83:0a:e8:89:7b:87:cb:d6:b8:47:4e:8d:51:
                        00:f4:ac:9d:ef:10:e4:97:4a:1b:69:6f:2f:86:df:
                        e0:81:24:c6:62:d2:00:b8:c7:60:da:97:db:da:b7:
                        c3:08:20:6e:70
                    ASN1 OID: prime256v1
                    NIST CURVE: P-256
            X509v3 extensions:
                X509v3 Key Usage: critical
                    Digital Signature, Key Encipherment
                X509v3 Extended Key Usage:
                    TLS Web Server Authentication
                X509v3 Basic Constraints: critical
                    CA:FALSE
                X509v3 Subject Key Identifier:
                    A8:EA:CD:1A:5D:AB:DC:47:A0:93:31:59:ED:05:E8:7E:40:6D:ED:8C
                X509v3 Authority Key Identifier:
                    keyid:2A:F2:F7:E8:F6:1F:55:D1:74:7D:59:94:B1:45:23:FD:A1:8C:97:9B
                X509v3 Subject Alternative Name:
                    DNS:ec2-3-18-214-69.us-east-2.compute.amazonaws.com, DNS:ip-192-168-65-123.us-east-2.compute.internal, IP Address:192.168.65.123, IP Address:3.18.214.69
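Because the full text dump is long, it can help to pull out only the expiry date. The sketch below relies on openssl x509 -enddate, which prints a single notAfter= line. cert_not_after is a hypothetical helper that strips the prefix.

```shell
# cert_not_after (hypothetical helper): read the output of
# "openssl x509 -noout -enddate" on stdin and print only the expiry timestamp.
cert_not_after() {
  sed -n 's/^notAfter=//p'
}

# Example on a worker node:
#   sudo openssl x509 -noout -enddate \
#     -in /var/lib/kubelet/pki/kubelet-server-current.pem | cert_not_after
#
# openssl can also test expiry directly: "-checkend 0" exits non-zero when the
# certificate has already expired.
#   sudo openssl x509 -noout -checkend 0 \
#     -in /var/lib/kubelet/pki/kubelet-server-current.pem
```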
2. Check the kubelet logs for certificate errors. If you don't see an error, then the requirement to submit a new certificate is met.
Example of a kubelet log certificate error:
    kubelet[8070]: I1021 18:49:21.594143 8070 log.go:184] http: TLS handshake error from 192.168.130.116:38710: no serving certificate available for the kubelet
**Note:** For more detailed logs, turn on kubelet verbose logging with the --v=4 flag, and then restart the kubelet on the worker node. The commands to do so look similar to the following:
    # kubelet verbosity can be increased by updating this file ... max verbosity level --v=4
    sudo vi /etc/systemd/system/kubelet.service.d/10-kubelet-args.conf

    # Normal kubelet verbosity is 2 by default
    cat /etc/systemd/system/kubelet.service.d/10-kubelet-args.conf
    [Service]
    Environment='KUBELET_ARGS=--node-ip=192.168.65.123 --pod-infra-container-image=XXXXXXXXXX.dkr.ecr.us-east-2.amazonaws.com/eks/pause:3.1-eksbuild.1 --v=2'

    # to restart the daemon and kubelet
    sudo systemctl daemon-reload
    sudo systemctl restart kubelet

    # make sure kubelet is in running state
    sudo systemctl status kubelet

    # to stream logs for kubelet
    journalctl -u kubelet -f
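Rather than reading the stream by eye, you can count how often the serving-certificate error appears. This is a small sketch: count_serving_cert_errors is a hypothetical helper, and it matches only the exact error text from the example log line in step 2.

```shell
# count_serving_cert_errors (hypothetical helper): read kubelet log lines on
# stdin and print how many contain the serving-certificate error message.
# grep exits non-zero when nothing matches, so "|| true" keeps the function
# usable in scripts that run with "set -e".
count_serving_cert_errors() {
  grep -cF 'no serving certificate available for the kubelet' || true
}

# Example on a worker node:
#   journalctl -u kubelet --no-pager --since "1 hour ago" | count_serving_cert_errors
```

A count of 0 suggests that the kubelet currently has a serving certificate. Any other value means that you should continue with the remaining steps.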
3. If you see an error, check the kubelet config file /etc/kubernetes/kubelet/kubelet-config.json on the worker node, and then confirm that the RotateKubeletServerCertificate and serverTLSBootstrap flags are set to true:
    "featureGates": {
      "RotateKubeletServerCertificate": true
    },
    "serverTLSBootstrap": true,
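The two settings can be checked without opening the file in an editor. The following sketch uses a plain text match so that it needs only grep. kubelet_cert_rotation_enabled is a hypothetical helper name, and the match does not validate the JSON structure.

```shell
# kubelet_cert_rotation_enabled (hypothetical helper): read kubelet-config.json
# on stdin and succeed only if both settings appear with the value true.
# This is a text match rather than a JSON parse, so it ignores nesting.
kubelet_cert_rotation_enabled() {
  local json
  json="$(cat)"
  printf '%s\n' "$json" |
    grep -Eq '"RotateKubeletServerCertificate"[[:space:]]*:[[:space:]]*true' &&
    printf '%s\n' "$json" |
    grep -Eq '"serverTLSBootstrap"[[:space:]]*:[[:space:]]*true'
}

# Example on a worker node:
#   sudo cat /etc/kubernetes/kubelet/kubelet-config.json \
#     | kubelet_cert_rotation_enabled && echo "both settings are true"
```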
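4. Run the following command against the eks:node-bootstrapper cluster role to confirm that the kubelet has the required role-based access control (RBAC) permissions to submit the certificate signing request (CSR):
    $ kubectl get clusterrole eks:node-bootstrapper -o yaml
The output looks similar to the following: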
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      annotations:
        kubectl.kubernetes.io/last-applied-configuration: |
          {"apiVersion":"rbac.authorization.k8s.io/v1","kind":"ClusterRole","metadata":{"annotations":{},"labels":{"eks.amazonaws.com/component":"node"},"name":"eks:node-bootstrapper"},"rules":[{"apiGroups":["certificates.k8s.io"],"resources":["certificatesigningrequests/selfnodeserver"],"verbs":["create"]}]}
      creationTimestamp: "2021-11-09T10:07:42Z"
      labels:
        eks.amazonaws.com/component: node
      name: eks:node-bootstrapper
      resourceVersion: "199"
      uid: da268bf3-31a3-420a-9a71-414229437b7e
    rules:
    - apiGroups:
      - certificates.k8s.io
      resources:
      - certificatesigningrequests/selfnodeserver
      verbs:
      - create
The required RBAC permissions include the following attributes:
    - apiGroups: ["certificates.k8s.io"]
      resources: ["certificatesigningrequests/selfnodeserver"]
      verbs: ["create"]
5. Run the following command to check that the cluster role eks:node-bootstrapper is bound to system:bootstrappers and system:nodes. This binding allows the kubelet to submit and rotate the serving certificate for itself.
    $ kubectl get clusterrolebinding eks:node-bootstrapper -o yaml
The output looks similar to the following:
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      annotations:
        kubectl.kubernetes.io/last-applied-configuration: |
          {"apiVersion":"rbac.authorization.k8s.io/v1","kind":"ClusterRoleBinding","metadata":{"annotations":{},"labels":{"eks.amazonaws.com/component":"node"},"name":"eks:node-bootstrapper"},"roleRef":{"apiGroup":"rbac.authorization.k8s.io","kind":"ClusterRole","name":"eks:node-bootstrapper"},"subjects":[{"apiGroup":"rbac.authorization.k8s.io","kind":"Group","name":"system:bootstrappers"},{"apiGroup":"rbac.authorization.k8s.io","kind":"Group","name":"system:nodes"}]}
      creationTimestamp: "2021-11-09T10:07:42Z"
      labels:
        eks.amazonaws.com/component: node
      name: eks:node-bootstrapper
      resourceVersion: "198"
      uid: f6214fe0-8258-4571-a7b9-ff3455add7b9
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: eks:node-bootstrapper
    subjects:
    - apiGroup: rbac.authorization.k8s.io
      kind: Group
      name: system:bootstrappers
    - apiGroup: rbac.authorization.k8s.io
      kind: Group
      name: system:nodes
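To check the binding without reading the whole YAML, you can ask kubectl for the subject names only and test that both groups are present. This is a minimal sketch: has_required_subjects is a hypothetical helper, and it assumes that the jsonpath output is a space-separated list of names.

```shell
# has_required_subjects (hypothetical helper): take a space-separated list of
# subject names and succeed only if both node groups are present as whole words.
has_required_subjects() {
  local names=" $1 "
  case "$names" in *" system:bootstrappers "*) ;; *) return 1 ;; esac
  case "$names" in *" system:nodes "*) ;; *) return 1 ;; esac
}

# Example:
#   has_required_subjects "$(kubectl get clusterrolebinding eks:node-bootstrapper \
#     -o jsonpath='{.subjects[*].name}')" && echo "both groups are bound"
```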