The Amazon Elastic Container Service (Amazon ECS) deployment circuit breaker set my deployment state to FAILED. I want to troubleshoot what caused the deployment to fail.
Short description
When the number of consecutive failures in a deployment reaches the defined threshold, the deployment circuit breaker sets the deployment state to FAILED. You might receive the following error message:
"Resource handler returned message: "Error occurred during operation 'ECS Deployment Circuit Breaker was triggered'." (RequestToken: xxxxxxxx-xxxx-xxxxxx-xxxxxxx, HandlerErrorCode: GeneralServiceException)"
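The threshold depends on the service's desired task count. The following is a rough sketch of how it could be estimated, assuming the rule described in the Amazon ECS Developer Guide: half the desired count, rounded up, and kept between a minimum of 3 and a maximum of 200. The constants and the function name are assumptions for illustration, so verify them against the current documentation.

```python
import math

# Assumed values based on the Amazon ECS Developer Guide's description of the
# deployment circuit breaker. Verify against the current documentation.
MIN_THRESHOLD = 3
MAX_THRESHOLD = 200
RATIO = 0.5


def failure_threshold(desired_count: int) -> int:
    """Estimate how many failed tasks activate the circuit breaker."""
    if desired_count < 0:
        raise ValueError("desired_count must be non-negative")
    calculated = math.ceil(RATIO * desired_count)
    # Clamp the calculated value between the minimum and the maximum.
    return max(MIN_THRESHOLD, min(MAX_THRESHOLD, calculated))


if __name__ == "__main__":
    for count in (1, 25, 400, 800):
        print(count, failure_threshold(count))
```

Under these assumptions, a service with a small desired count still tolerates a few failed tasks before the deployment is marked FAILED.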
The following issues can cause your deployment to fail:
- A container failed the health check.
- A target group failed the Application Load Balancer health checks.
- An AWS Cloud Map service health check failed, if you configured one.
- The Amazon Elastic Container Registry (Amazon ECR) image doesn't exist.
- Your container instances didn't meet all the task placement criteria.
- Tasks from the new deployment repeatedly stopped or failed to start.
Resolution
To troubleshoot this issue, check the Amazon ECS service event messages to identify why Amazon ECS activated the circuit breaker. Then, take the following actions based on the reason.
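As a minimal sketch of this first step, the following Python functions take one service entry from a DescribeServices response (for example, from boto3's `describe_services`) and report failed deployments and the likely cause of each event. The function names are hypothetical. The first two message patterns are quoted in this article, and the remaining patterns are assumptions about typical event wording that you might need to adjust.

```python
from typing import Dict, List, Optional, Tuple

# (substring, cause) pairs, matched case-insensitively.
EVENT_PATTERNS: List[Tuple[str, str]] = [
    ("failed container health checks", "container health check"),
    ("no container instance met all of its requirements", "task placement"),
    # Assumed wording; adjust to the messages you see in your service.
    ("is unhealthy in (target-group", "load balancer health check"),
    ("unable to place a task", "task placement"),
]


def classify_event(message: str) -> Optional[str]:
    """Return a likely cause for one service event message, or None."""
    lowered = message.lower()
    for needle, cause in EVENT_PATTERNS:
        if needle in lowered:
            return cause
    return None


def summarize_service(service: Dict) -> Dict:
    """Summarize one entry of the 'services' list from DescribeServices."""
    failed = [
        {"id": d.get("id"), "reason": d.get("rolloutStateReason", "")}
        for d in service.get("deployments", [])
        if d.get("rolloutState") == "FAILED"
    ]
    causes: List[str] = []
    for event in service.get("events", []):
        cause = classify_event(event.get("message", ""))
        if cause and cause not in causes:
            causes.append(cause)
    return {"failed_deployments": failed, "likely_causes": causes}


if __name__ == "__main__":
    # Hand-written sample data, not real output.
    sample = {
        "deployments": [{"id": "ecs-svc/1", "rolloutState": "FAILED"}],
        "events": [{
            "message": "(service AWS-Service) (task ff3e71a4-d7e5-428b-9232-"
                       "2345657889) failed container health checks."
        }],
    }
    print(summarize_service(sample))
```

Events that match no pattern are ignored, so review the raw messages as well when the summary is empty.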
Resolve container health check failures
If the Amazon ECS containers in your task can't pass the health checks, then you receive the following error message:
"(service AWS-Service) (task ff3e71a4-d7e5-428b-9232-2345657889) failed container health checks."
To resolve this issue, see How do I troubleshoot container health check failures for Amazon ECS tasks?
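To find which container reported the failure, you can inspect the `healthStatus` field of each container in a DescribeTasks response. The following is a minimal sketch that assumes you already have the response dictionary (for example, from boto3's `describe_tasks`). The function name is hypothetical.

```python
from typing import Dict, List, Tuple


def unhealthy_containers(describe_tasks_response: Dict) -> List[Tuple[str, str]]:
    """Return (task ARN, container name) pairs whose healthStatus is UNHEALTHY."""
    found: List[Tuple[str, str]] = []
    for task in describe_tasks_response.get("tasks", []):
        for container in task.get("containers", []):
            if container.get("healthStatus") == "UNHEALTHY":
                found.append((task.get("taskArn", ""), container.get("name", "")))
    return found


if __name__ == "__main__":
    # Hand-written sample data, not real output.
    sample = {
        "tasks": [{
            "taskArn": "arn:aws:ecs:region:account:task/cluster/example",
            "containers": [
                {"name": "web", "healthStatus": "UNHEALTHY"},
                {"name": "sidecar", "healthStatus": "HEALTHY"},
            ],
        }]
    }
    print(unhealthy_containers(sample))
```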
Resolve Application Load Balancer health check failures
To resolve Application Load Balancer health check failures, take the following actions:
- Verify that you correctly configured your target group's health check settings.
- Confirm that your application correctly responds to the specified health check request.
- Confirm that no network or security group issues block the health check requests.
To troubleshoot Application Load Balancer health check failures, see How do I troubleshoot failed health checks for Application Load Balancers?
Note: Amazon ECS initiates a rollback only when health check failures are consecutive.
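To see why the load balancer marks targets as unhealthy, you can review the state, reason code, and description that the Elastic Load Balancing DescribeTargetHealth API returns for each target. The following is a minimal sketch that assumes you already have the response dictionary (for example, from boto3's `elbv2` client). The function name is hypothetical.

```python
from typing import Dict, List


def unhealthy_targets(describe_target_health_response: Dict) -> List[Dict]:
    """List targets whose state is anything other than 'healthy'."""
    rows: List[Dict] = []
    for desc in describe_target_health_response.get("TargetHealthDescriptions", []):
        target = desc.get("Target", {})
        health = desc.get("TargetHealth", {})
        if health.get("State") != "healthy":
            rows.append({
                "target": target.get("Id", ""),
                "port": target.get("Port"),
                "state": health.get("State", ""),
                "reason": health.get("Reason", ""),
                "description": health.get("Description", ""),
            })
    return rows


if __name__ == "__main__":
    # Hand-written sample data, not real output.
    sample = {
        "TargetHealthDescriptions": [{
            "Target": {"Id": "10.0.1.15", "Port": 8080},
            "TargetHealth": {
                "State": "unhealthy",
                "Reason": "Target.ResponseCodeMismatch",
                "Description": "Health checks failed with these codes: [502]",
            },
        }]
    }
    for row in unhealthy_targets(sample):
        print(row)
```

A response code mismatch usually points to the application or health check path, while a timeout more often points to network or security group settings.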
Resolve AWS Cloud Map health check failures
If you configured AWS Cloud Map with external health checks or custom health checks, review whether the application endpoint is healthy.
To determine application endpoint health, see How Amazon Route 53 determines whether a health check is healthy. For more information about AWS Cloud Map health checks, see AWS Cloud Map service health check configuration.
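As a minimal sketch, the following function filters the response of the AWS Cloud Map GetInstancesHealthStatus API (for example, from boto3's `servicediscovery` client) down to the instances that aren't reported as HEALTHY. It assumes that you already have the response dictionary. The function name is hypothetical.

```python
from typing import Dict


def non_healthy_instances(health_status_response: Dict) -> Dict[str, str]:
    """Return {instance ID: status} for instances not reported as HEALTHY."""
    statuses = health_status_response.get("Status", {})
    return {
        instance_id: status
        for instance_id, status in statuses.items()
        if status != "HEALTHY"
    }


if __name__ == "__main__":
    # Hand-written sample data, not real output.
    sample = {"Status": {"instance-a": "HEALTHY", "instance-b": "UNHEALTHY"}}
    print(non_healthy_instances(sample))
```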
Resolve missing Amazon ECR image issues
To resolve this issue, see How do I resolve the "Image does not exist" error when my tasks fail to start in my Amazon ECS cluster?
Resolve task placement criteria issues
To resolve this issue, see How do I resolve the "no container instance met all of its requirements" error in Amazon ECS?
Resolve task startup failures
To resolve this issue, take the following actions:
- Use Amazon CloudWatch Logs Insights to review your application logs, and call the DescribeTasks API to get the task's stoppedReason.
- Confirm that the cluster has active instances.
- Confirm that the task's CPU or memory doesn't exceed the container instance's CPU or memory.
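The last two checks can be sketched as follows. The functions take the `containerInstances` list from a DescribeContainerInstances response and return the active instances that have enough remaining CPU units and memory (in MiB) for the task. The function names and the sample values are hypothetical.

```python
from typing import Dict, List


def remaining(instance: Dict, resource_name: str) -> int:
    """Read one integer resource (for example, CPU or MEMORY) from remainingResources."""
    for resource in instance.get("remainingResources", []):
        if resource.get("name") == resource_name:
            return resource.get("integerValue", 0)
    return 0


def instances_that_fit(container_instances: List[Dict],
                       task_cpu: int, task_memory: int) -> List[str]:
    """Return ARNs of ACTIVE instances with enough remaining CPU and memory."""
    return [
        ci.get("containerInstanceArn", "")
        for ci in container_instances
        if ci.get("status") == "ACTIVE"
        and remaining(ci, "CPU") >= task_cpu
        and remaining(ci, "MEMORY") >= task_memory
    ]


if __name__ == "__main__":
    # Hand-written sample data, not real output.
    sample = [{
        "containerInstanceArn": "instance-1",
        "status": "ACTIVE",
        "remainingResources": [
            {"name": "CPU", "type": "INTEGER", "integerValue": 512},
            {"name": "MEMORY", "type": "INTEGER", "integerValue": 1024},
        ],
    }]
    print(instances_that_fit(sample, task_cpu=1024, task_memory=2048))
```

An empty result means that no active instance can currently host the task with the requested CPU and memory.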
For more information about why a task stopped, see Why is my Amazon ECS task stopped? If the task stops or fails to start because a container exits, see How do I troubleshoot Amazon ECS tasks that stop or fail to start when my container exits?
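To collect the stoppedReason for several tasks at once, you can summarize a DescribeTasks response as in the following minimal sketch. It assumes that you already have the response dictionary (for example, from boto3's `describe_tasks`). The function name and the sample values are hypothetical.

```python
from typing import Dict, List


def stop_summary(describe_tasks_response: Dict) -> List[Dict]:
    """Summarize why each stopped task in a DescribeTasks response stopped."""
    summaries: List[Dict] = []
    for task in describe_tasks_response.get("tasks", []):
        if task.get("lastStatus") != "STOPPED":
            continue
        summaries.append({
            "task": task.get("taskArn", ""),
            "stop_code": task.get("stopCode", ""),
            "stopped_reason": task.get("stoppedReason", ""),
            # Per-container exit code and reason, when the agent reported them.
            "containers": [
                {
                    "name": c.get("name", ""),
                    "exit_code": c.get("exitCode"),
                    "reason": c.get("reason", ""),
                }
                for c in task.get("containers", [])
            ],
        })
    return summaries


if __name__ == "__main__":
    # Hand-written sample data, not real output.
    sample = {
        "tasks": [{
            "taskArn": "task-1",
            "lastStatus": "STOPPED",
            "stopCode": "EssentialContainerExited",
            "stoppedReason": "Essential container in task exited",
            "containers": [{"name": "web", "exitCode": 1}],
        }]
    }
    for summary in stop_summary(sample):
        print(summary)
```

Amazon ECS keeps stopped tasks visible for only a limited time, so run this check soon after the deployment fails.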
Related information
Announcing Amazon ECS deployment circuit breaker