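How do I configure and monitor the Amazon ECS deployment circuit breaker?
I want my Amazon Elastic Container Service (Amazon ECS) deployments to roll back automatically when they fail, and I want to receive a notification when that happens.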
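Short description
To use the Amazon ECS deployment circuit breaker to automatically roll back and monitor deployments, complete the following steps:
- Configure the deployment circuit breaker.
- Configure Amazon EventBridge to monitor the deployment circuit breaker.
- Test a deployment failure scenario.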
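Resolution
**Note:** If you receive errors when you run AWS Command Line Interface (AWS CLI) commands, then see Troubleshooting errors for the AWS CLI. Also, make sure that you're using the most recent AWS CLI version.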
Configure the deployment circuit breaker
Complete the following steps:
- Create a JSON file with a task definition similar to the following example:

  ```json
  {
    "family": "my-task",
    "containerDefinitions": [
      {
        "name": "sample-container",
        "image": "nginx:alpine",
        "essential": true
      }
    ],
    "executionRoleArn": "arn:aws:iam::123456789876:role/ecsTaskExecutionRole",
    "networkMode": "awsvpc",
    "requiresCompatibilities": [
      "FARGATE"
    ],
    "cpu": "256",
    "memory": "512"
  }
  ```

  **Note:** Replace 123456789876 with your AWS account ID. If you don't have ecsTaskExecutionRole, then create the task execution role.
- To register the task definition, run the following register-task-definition command:

  ```shell
  aws ecs register-task-definition \
    --cli-input-json file://taskdef-success.json
  ```

  **Note:** Replace taskdef-success.json with your task definition JSON file.
- To create an Amazon ECS service with the deployment circuit breaker and rollback activated, run the following create-service command:

  ```shell
  aws ecs create-service \
    --cluster default \
    --service-name my-sample-service \
    --deployment-controller type=ECS \
    --desired-count 1 \
    --deployment-configuration "deploymentCircuitBreaker={enable=true,rollback=true}" \
    --task-definition my-task:1 \
    --launch-type FARGATE \
    --network-configuration "awsvpcConfiguration={subnets=[subnet-12345],securityGroups=[sg-12345],assignPublicIp=ENABLED}"
  ```

  **Note:** Replace subnet-12345 with your subnet and sg-12345 with your security group. You must set deployment-controller to type=ECS because the deployment circuit breaker applies only to rolling update deployments.

  If you don't have a default cluster, then run the following create-cluster command to create a cluster:

  ```shell
  aws ecs create-cluster \
    --cluster-name example-cluster
  ```

  **Note:** Replace example-cluster with your cluster name. If the name isn't default, then use the same name for the --cluster option in the other commands.
- To confirm that the Amazon ECS service is in a steady state, run the following describe-services command:

  ```shell
  aws ecs describe-services \
    --cluster default \
    --services my-sample-service | jq '.services[0].events[] | {message}'
  ```

  You receive output similar to the following example:

  ```json
  {
    "message": "(service my-sample-service) has reached a steady state."
  }
  {
    "message": "(service my-sample-service) (deployment ecs-svc/1234567890123456789) deployment completed."
  }
  {
    "message": "(service my-sample-service) has started 1 tasks: (task 2918eb15dd0f4d42affc2a3a07818abf)."
  }
  ```
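
Before you test a failure, it can help to confirm that the service actually stored the circuit breaker settings. The following is a minimal sketch, not part of the original procedure: the function name show_circuit_breaker is hypothetical, and it assumes that the AWS CLI is installed and configured for the account and Region that host the service.

```shell
# Hypothetical helper: print the deployment circuit breaker settings
# that Amazon ECS stored for a service.
# Arguments: $1 = cluster name, $2 = service name
show_circuit_breaker() {
  aws ecs describe-services \
    --cluster "$1" \
    --services "$2" \
    --query 'services[0].deploymentConfiguration.deploymentCircuitBreaker' \
    --output json
}
```

For the service in this article, you might run `show_circuit_breaker default my-sample-service`. If the create-service command was applied as shown, then both the enable and rollback fields should be reported as true.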
Configure EventBridge to monitor the deployment circuit breaker
Complete the following steps:
- To create an Amazon Simple Notification Service (Amazon SNS) topic to use as the EventBridge rule target, run the following create-topic command:

  ```shell
  aws sns create-topic \
    --name my-topic
  ```

  **Note:** Replace my-topic with your SNS topic name.
- To update the topic's access policy so that EventBridge can publish to the topic, run the following set-topic-attributes command:

  ```shell
  aws sns set-topic-attributes \
    --topic-arn arn:aws:sns:eu-west-1:123456789876:my-topic \
    --attribute-name Policy \
    --attribute-value '{
      "Version": "2008-10-17",
      "Id": "my_topic_policy",
      "Statement": [
        {
          "Sid": "my_topic_default",
          "Effect": "Allow",
          "Principal": {
            "AWS": "*"
          },
          "Action": [
            "SNS:GetTopicAttributes",
            "SNS:SetTopicAttributes",
            "SNS:AddPermission",
            "SNS:RemovePermission",
            "SNS:DeleteTopic",
            "SNS:Subscribe",
            "SNS:ListSubscriptionsByTopic",
            "SNS:Publish"
          ],
          "Resource": "arn:aws:sns:eu-west-1:123456789876:my-topic",
          "Condition": {
            "StringEquals": {
              "AWS:SourceOwner": "123456789876"
            }
          }
        },
        {
          "Sid": "my_topic_for_sns_Publish",
          "Effect": "Allow",
          "Principal": {
            "Service": "events.amazonaws.com"
          },
          "Action": "sns:Publish",
          "Resource": "arn:aws:sns:eu-west-1:123456789876:my-topic"
        }
      ]
    }'
  ```

  **Note:** Replace eu-west-1 with your AWS Region, 123456789876 with your account ID, and my-topic with your topic name.
- To subscribe to the SNS topic by email, run the following subscribe command:

  ```shell
  aws sns subscribe \
    --topic-arn arn:aws:sns:eu-west-1:123456789876:my-topic \
    --protocol email \
    --notification-endpoint example@example.com
  ```

  **Note:** Replace eu-west-1 with your Region, 123456789876 with your account ID, my-topic with your topic name, and example@example.com with your email address.
- In the subscription confirmation email that you receive, choose Confirm subscription.
- To create an EventBridge rule for service deployment failure events, run the following put-rule command:

  ```shell
  aws events put-rule \
    --name "EcsServiceDeploymentFailed" \
    --event-pattern "{\"source\":[\"aws.ecs\"],\"detail-type\":[\"ECS Deployment State Change\"],\"detail\":{\"eventName\":[\"SERVICE_DEPLOYMENT_FAILED\"]}}"
  ```
- To add the SNS topic as the EventBridge rule target, run the following put-targets command:

  ```shell
  aws events put-targets \
    --rule EcsServiceDeploymentFailed \
    --targets "Id"="1","Arn"="arn:aws:sns:eu-west-1:123456789876:my-topic"
  ```

  **Note:** Replace eu-west-1 with your Region, 123456789876 with your account ID, and my-topic with your topic name.
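
The escaped event pattern in the put-rule command is easy to mistype. As an alternative sketch, you can keep the same pattern in a file and pass it with the AWS CLI file:// prefix, and then inspect the rule and its targets. The file name event-pattern.json and the function names below are assumptions made for illustration and don't appear in the original procedure.

```shell
# Write the same event pattern to a file so that no quote escaping is needed.
# The file name is an assumption.
cat > event-pattern.json <<'EOF'
{
  "source": ["aws.ecs"],
  "detail-type": ["ECS Deployment State Change"],
  "detail": {
    "eventName": ["SERVICE_DEPLOYMENT_FAILED"]
  }
}
EOF

# Hypothetical helper: create or update a rule from the pattern file.
# Argument: $1 = rule name
create_rule_from_file() {
  aws events put-rule \
    --name "$1" \
    --event-pattern file://event-pattern.json
}

# Hypothetical helper: show the rule definition and the targets attached to it.
# Argument: $1 = rule name
show_rule() {
  aws events describe-rule --name "$1"
  aws events list-targets-by-rule --rule "$1"
}
```

For example, `show_rule EcsServiceDeploymentFailed` should list the SNS topic ARN among the targets if the put-targets command succeeded.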
Test a deployment failure scenario
Complete the following steps:
- Create a JSON file with a task definition that contains an incorrect image tag, similar to the following example:

  ```json
  {
    "family": "my-task",
    "containerDefinitions": [
      {
        "name": "sample-container",
        "image": "nginx:wrong-image-tag",
        "essential": true
      }
    ],
    "executionRoleArn": "arn:aws:iam::123456789876:role/ecsTaskExecutionRole",
    "networkMode": "awsvpc",
    "requiresCompatibilities": [
      "FARGATE"
    ],
    "cpu": "256",
    "memory": "512"
  }
  ```

  **Note:** Replace sample-container with your container name, nginx:wrong-image-tag with an incorrect image tag, and 123456789876 with your account ID. The incorrect image tag causes the deployment to fail.
- To register the task definition, run the following register-task-definition command:

  ```shell
  aws ecs register-task-definition \
    --cli-input-json file://taskdef-failure.json
  ```

  **Note:** Replace taskdef-failure.json with the name of your task definition JSON file.
- To update the service with the new task definition and start a new deployment, run the following update-service command:

  ```shell
  aws ecs update-service \
    --service my-sample-service \
    --task-definition my-task:2
  ```

  **Note:** Replace my-sample-service with your service name and my-task:2 with your task definition family and revision. If your service isn't in the default cluster, then also add the --cluster option. Because the tasks can't pull the image, the new deployment fails. After the deployment fails, the notification that you receive contains an event similar to the following example:

  ```json
  {
    "version": "0",
    "id": "12345abc-2f7c-f86a-e544-a69218eb1446",
    "detail-type": "ECS Deployment State Change",
    "source": "aws.ecs",
    "account": "123456789876",
    "time": "2024-11-19T17:42:41Z",
    "region": "eu-west-1",
    "resources": [
      "arn:aws:ecs:eu-west-1:123456789876:service/default/my-sample-service"
    ],
    "detail": {
      "eventType": "ERROR",
      "eventName": "SERVICE_DEPLOYMENT_FAILED",
      "clusterArn": "arn:aws:ecs:eu-west-1:123456789876:cluster/default",
      "deploymentId": "ecs-svc/9876543210987654321",
      "updatedAt": "2024-11-19T17:42:40.73Z",
      "reason": "ECS deployment circuit breaker: tasks failed to start."
    }
  }
  ```
- To verify that the Amazon ECS service rolled back, run the following describe-services command:

  ```shell
  aws ecs describe-services \
    --cluster default \
    --services my-sample-service | jq '.services[0].events[] | {message}'
  ```

  You receive output similar to the following example:

  ```json
  {
    "message": "(service my-sample-service) has reached a steady state."
  }
  {
    "message": "(service my-sample-service) (deployment ecs-svc/1234567890123456789) deployment completed."
  }
  {
    "message": "(service my-sample-service) rolling back to deployment ecs-svc/1234567890123456789."
  }
  {
    "message": "(service my-sample-service) (deployment ecs-svc/9876543210987654321) deployment failed: tasks failed to start."
  }
  {
    "message": "(service my-sample-service) has started 1 tasks: (task b808c60616134ec1ac0c656a2bff1ef2)."
  }
  {
    "message": "(service my-sample-service) has started 1 tasks: (task 846c9aebd9224c2b832a38942cae5ea6)."
  }
  {
    "message": "(service my-sample-service) has started 1 tasks: (task 7143d03444574f2db2b567d75df3fe72)."
  }
  {
    "message": "(service my-sample-service) has started 1 tasks: (task 9a6a399770d940a2b442560c02a6a4c0)."
  }
  {
    "message": "(service my-sample-service) has reached a steady state."
  }
  {
    "message": "(service my-sample-service) (deployment ecs-svc/1234567890123456789) deployment completed."
  }
  {
    "message": "(service my-sample-service) has started 1 tasks: (task 2918eb15dd0f4d42affc2a3a07818abf)."
  }
  ```
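
Service event messages are free text, so it can be easier to follow a rollback through the rolloutState field that describe-services reports for each deployment. The following sketch polls that field until no deployment is in progress. The function names, the 15-second interval, and the limit of 40 attempts are assumptions for illustration. The sketch also assumes that the AWS CLI is configured, and a failed deployment might drop out of the list shortly after the rollback completes.

```shell
# Hypothetical helper: list each deployment of a service with its status,
# rollout state, and task definition, one deployment per line.
# Arguments: $1 = cluster name, $2 = service name
show_rollout_state() {
  aws ecs describe-services \
    --cluster "$1" \
    --services "$2" \
    --query 'services[0].deployments[].[id,status,rolloutState,taskDefinition]' \
    --output text
}

# Hypothetical helper: poll until no deployment reports IN_PROGRESS,
# then print the final states. Gives up after 40 attempts (about 10 minutes).
wait_for_rollout() {
  attempts=0
  while [ "$attempts" -lt 40 ]; do
    state="$(show_rollout_state "$1" "$2")"
    case "$state" in
      *IN_PROGRESS*) sleep 15 ;;
      *) printf '%s\n' "$state"; return 0 ;;
    esac
    attempts=$((attempts + 1))
  done
  echo "Timed out while waiting for the rollout to finish" >&2
  return 1
}
```

For the service in this article, you might run `wait_for_rollout default my-sample-service` right after the update-service command. After a successful rollback, the primary deployment should reference the earlier task definition revision (my-task:1 in this example).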
