为什么统一 CloudWatch 代理未将我的指标或日志事件推送到 CloudWatch?
我在我的 Amazon Elastic Compute Cloud(Amazon EC2)实例上配置了统一 Amazon CloudWatch 代理,以将指标和日志发布到 Amazon CloudWatch。但我在 CloudWatch 控制台中看不到我的指标或日志。我想要在 CloudWatch 中查看我的指标和日志事件
简短描述
统一 CloudWatch 代理未将您的指标或日志推送到 CloudWatch 的原因可能有多种。例如,您可能遇到了权限或连接错误,导致代理无法发布指标。在查看统一 CloudWatch 代理日志时,可能会出现其中一种错误:
- 代理日志错误: 无法连接到端点
- 代理日志错误: 权限不足
解决方法
**注意:**如果在运行 AWS 命令行界面(AWS CLI)命令时收到错误,请参阅 Troubleshoot AWS CLI errors。此外,请确保使用的是最新版本的 AWS CLI。
查看统一 CloudWatch 代理日志
可使用代理日志文件来帮助解决在使用统一 CloudWatch 代理包时遇到的问题。
可能会遇到以下其中一种问题:
- 与所需的 AWS 服务端点或 Amazon Virtual Private Cloud(Amazon VPC)端点的连接出现问题。
- 您没有正确的权限对 CloudWatch 进行支持 API 调用。
可能会在以下日志中看到其中一个错误。
代理日志错误: 无法连接到端点
2021-08-30T04:07:46Z E! cloudwatch: code: RequestError, message: send request failed, original error: Post "https://monitoring.us-east-1.amazonaws.com/": dial tcp 172.31.11.121:443: i/o timeout 2021-08-30T04:07:46Z W! 210 retries, going to sleep 1m0s before retrying. 2021-08-30T04:07:46Z E! cloudwatch: code: RequestError, message: send request failed, original error: Post "https://monitoring.us-east-1.amazonaws.com/": dial tcp 172.31.11.121:443: i/o timeout 2021-08-30T04:07:46Z W! 211 retries, going to sleep 1m0s before retrying.
代理日志错误: 权限不足
2021-08-30T02:15:45Z E! cloudwatch: code: AccessDenied, message: User: arn:aws:sts::123456789012:assumed-role/cwagent/i-0744de7c842d2c2ba is not authorized to perform: cloudwatch:PutMetricData, original error: 2021-08-30T02:15:45Z W! 1 retries, going to sleep 400ms before retrying. 2021-08-30T02:15:46Z E! WriteToCloudWatch failure, err: AccessDenied: User: arn:aws:sts::123456789012:assumed-role/cwagent/i-0744de7c842d2c2ba is not authorized to perform: cloudwatch:PutMetricData status code: 403, request id: f1171fd0-05b6-4f7d-bac2-629c8594c46e
确认与 CloudWatch 端点的连接
如果流向 CloudWatch 的流量不经过公共互联网,则可以使用 Amazon VPC 端点。如果使用 Amazon VPC 端点,请检查下面的参数:
- 如果使用私有名称服务器,请确认 DNS 解析提供了准确的响应。
- 确认 CloudWatch 端点解析为私有 IP 地址。
- 确认与允许来自主机的入站流量的 Amazon VPC 端点关联的安全组。
要确认与 CloudWatch 端点的连接,请完成下面的步骤:
-
要检查与指标端点的连接,请运行下面的命令:
$ telnet monitoring.us-east-1.amazonaws.com 443 Trying 52.46.138.115... Connected to monitoring.amazonaws.com. Escape character is '^]'. ^] telnet> quit Connection closed.
-
要检查与日志端点的连接,请运行下面的命令:
$ telnet logs.us-east-1.amazonaws.com 443 Trying 3.236.94.218... Connected to logs.us-east-1.amazonaws.com. Escape character is '^]'. ^] telnet> quit Connection closed
-
要检查 Amazon VPC 端点是否解析为私有 IP 地址,请运行下面的命令:
$ dig monitoring.us-east-1.amazonaws.com +short172.31.11.121 172.31.0.13
查看统一 CloudWatch 代理配置
代理配置文件详细说明了发布至 CloudWatch 的指标和日志。请查看代理配置文件,确认包含要发布的日志和指标。
确认主机有权发布指标和日志
AWS 托管策略 CloudWatchAgentServerPolicy 和 CloudWatchAgentAdminPolicy 可帮助部署统一 CloudWatch 代理。这些策略还可帮助检查您是否拥有正确的权限。请使用这些策略作为参考,确保您的主机拥有正确的权限。
这些示例中的 AWS CLI 输出显示权限不足。
以下 AWS CLI config 命令显示缺少连接到 EC2 实例的 AWS Identity and Access Management(IAM)角色:
$ /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -c ssm:CWT-Web-Server -s ****** processing amazon-cloudwatch-agent ****** /opt/aws/amazon-cloudwatch-agent/bin/config-downloader --output-dir /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d --download-source ssm:CWT-Web-Server --mode ec2 --config /opt/aws/amazon-cloudwatch-agent/etc/common-config.toml --multi-config default Region: us-east-1 credsConfig: map[] Error in retrieving parameter store content: NoCredentialProviders: no valid providers in chain. Deprecated. For verbose messaging see aws.Config.CredentialsChainVerboseErrors Fail to fetch/remove json config: NoCredentialProviders: no valid providers in chain. Deprecated. For verbose messaging see aws.Config.CredentialsChainVerboseErrors Fail to fetch the config!
以下 AWS CLI config 命令显示错误的 IAM 角色已连接到 EC2 实例:
$ /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -c ssm:CWT-Web-Server -s ****** processing amazon-cloudwatch-agent ****** /opt/aws/amazon-cloudwatch-agent/bin/config-downloader --output-dir /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d --download-source ssm:CWT-Web-Server --mode ec2 --config /opt/aws/amazon-cloudwatch-agent/etc/common-config.toml --multi-config default Region: us-east-1 credsConfig: map[] Error in retrieving parameter store content: AccessDeniedException: User: arn:aws:sts::123456789012:assumed-role/cwagent/i-0744de7c842d2c2ba is not authorized to perform: ssm:GetParameter on resource: arn:aws:ssm:us-east-1:123456789012:parameter/CWT-Web-Server status code: 400, request id: b85b0a7a-0fb1-47b4-924f-be8cf43a3b4d Fail to fetch/remove json config: AccessDeniedException: User: arn:aws:sts::123456789012:assumed-role/cwagent/i-0744de7c842d2c2ba is not authorized to perform: ssm:GetParameter on resource: arn:aws:ssm:us-east-1:123456789012:parameter/CWT-Web-Server status code: 400, request id: b85b0a7a-0fb1-47b4-924f-be8cf43a3b4d Fail to fetch the config!
以下 get-caller-identity 命令会返回与实例关联的 IAM 用户或角色:
$ aws sts get-caller-identity { "UserId": "AROA123456789012ABCDE:i-0744de7c842d2c2ba", "Account": "123456789012", "Arn": "arn:aws:sts::123456789012:assumed-role/CloudWatchAgentServerRole/i-0744de7c842d2c2ba" }
确认代理正确启动
可以使用 AWS CLI 并将配置文件作为参数传递来启动代理。要启动代理,请运行下面有效的启动命令。
对于 Linux,运行下面的命令:
- `$ sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -s -c file:configuration-file-path` - `$ sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -s -c ssm:configuration-parameter-store-name`
对于 Windows,运行下面的命令:
- `& "C:\Program Files\Amazon\AmazonCloudWatchAgent\amazon-cloudwatch-agent-ctl.ps1" -a fetch-config -m ec2 -s -c file:"C:\Program Files\Amazon\AmazonCloudWatchAgent\config.json"` - `& "C:\Program Files\Amazon\AmazonCloudWatchAgent\amazon-cloudwatch-agent-ctl.ps1" -a fetch-config -m ec2 -s -c ssm:configuration-parameter-store-name`
重要信息: 请勿从 Windows 控制面板启动代理。
确认代理运行
要发布指标和日志,代理必须处于活动状态。要确认代理处于活动状态,请运行下面的命令:
$ sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -m ec2 -a status { "status": "running", "starttime": "2021-08-30T02:13:44+00:00", "configstatus": "configured", "cwoc_status": "stopped", "cwoc_starttime": "", "cwoc_configstatus": "not configured", "version": "1.247349.0b251399" }
在更新代理配置后重启代理
代理不会自动注册对配置文件的更改。如果代理配置已更新,包括新的或不同的指标和日志,则必须使用下面的命令重启代理:
$ sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -m ec2 -a stop ****** processing cwagent-otel-collector ****** cwagent-otel-collector has already been stopped ****** processing amazon-cloudwatch-agent ****** Redirecting to /bin/systemctl stop amazon-cloudwatch-agent.service $ sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -s -c file:config.json ****** processing amazon-cloudwatch-agent ****** /opt/aws/amazon-cloudwatch-agent/bin/config-downloader --output-dir /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d --download-source file:config.json --mode ec2 --config /opt/aws/amazon-cloudwatch-agent/etc/common-config.toml --multi-config default Successfully fetched the config and saved in /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d/file_config.json.tmp Start configuration validation... /opt/aws/amazon-cloudwatch-agent/bin/config-translator --input /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json --input-dir /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d --output /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.toml --mode ec2 --config /opt/aws/amazon-cloudwatch-agent/etc/common-config.toml --multi-config default 2021/08/31 02:45:37 Reading json config file path: /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d/file_config.json.tmp ... Valid Json input schema. I! Detecting run_as_user... Configuration validation first phase succeeded /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent -schematest -config /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.toml Configuration validation second phase succeeded Configuration validation succeeded amazon-cloudwatch-agent has already been stopped Redirecting to /bin/systemctl restart amazon-cloudwatch-agent.service $ sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -m ec2 -a status { "status": "running", "starttime": "2021-08-31T02:45:37+0000", "configstatus": "configured", "cwoc_status": "stopped", "cwoc_starttime": "", "cwoc_configstatus": "not configured", "version": "1.247349.0b251399" }
相关信息
相关内容
- AWS 官方已更新 7 个月前
- AWS 官方已更新 2 年前
- AWS 官方已更新 2 年前
- AWS 官方已更新 4 个月前