My Amazon Elastic Container Service (Amazon ECS) task is taking a long time to move to the STOPPED state. Or, my Amazon ECS task is stuck in the RUNNING state when the container instance is set to DRAINING.
Short description
When you set an ECS instance to DRAINING, Amazon ECS prevents new tasks from being scheduled for placement on the container instance. Amazon ECS also stops tasks on the container instance that are in the RUNNING state.
Issues with configuration parameters or tasks can keep tasks in the RUNNING state or delay their transition to the STOPPED state.
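To troubleshoot these issues, check your deployment configuration parameters, the load balancer's deregistration delay, and the ECS_CONTAINER_STOP_TIMEOUT value. Then, look for other task-related issues.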
Resolution
To troubleshoot Amazon ECS tasks that take a long time to stop, complete the following tasks.
Update your DeploymentConfiguration parameters
Complete the following steps:
- Open the Amazon ECS console.
- In the navigation pane, choose Clusters. Then, choose the cluster where your container instance is draining.
- Choose the Infrastructure tab.
- Under Container instances, filter by Status for DRAINING.
- Choose your container instance, and then identify the service that owns the tasks that aren't draining or are taking a long time to drain.
- Choose the Services tab, select the service, and then choose Deployments.
- Check the values for minimumHealthyPercent and maximumPercent.
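Note: Service tasks on the container instance that are in the RUNNING state are stopped and replaced according to the service's deployment configuration parameters. If both minimumHealthyPercent and maximumPercent are set to 100, then the service scheduler can't stop existing tasks or start replacement tasks, and the container instance can't finish draining. For more information, see Draining Amazon ECS container instances.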
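The following Python sketch illustrates how the two percentages bound the number of running tasks during draining. It's a simplified model, not the scheduler's actual logic: the function names are hypothetical, and it assumes that the lower bound is rounded up and the upper bound is rounded down to whole tasks.

```python
def draining_task_bounds(desired_count, minimum_healthy_percent, maximum_percent):
    """Return (min_running, max_running) task counts for a service.

    Assumption: the lower bound rounds up and the upper bound rounds down.
    Integer arithmetic avoids floating-point rounding surprises.
    """
    min_running = -(-desired_count * minimum_healthy_percent // 100)  # ceiling
    max_running = desired_count * maximum_percent // 100              # floor
    return min_running, max_running


def can_drain(desired_count, minimum_healthy_percent, maximum_percent):
    """Simplified check: draining needs room to stop a task first
    (min_running below desired) or to start a replacement first
    (max_running above desired)."""
    min_running, max_running = draining_task_bounds(
        desired_count, minimum_healthy_percent, maximum_percent
    )
    return min_running < desired_count or max_running > desired_count


if __name__ == "__main__":
    print(draining_task_bounds(4, 50, 200))
    print(can_drain(4, 100, 100))
```

With a desired count of 4, a minimum of 50 and a maximum of 200 give bounds of 2 and 8 running tasks. With both values at 100, the check returns False because there's no room to stop or start a task.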
Update the deregistration delay value
Important: The following steps apply only to services that use an Application Load Balancer or a Network Load Balancer. If your service uses a Classic Load Balancer, then check the connection draining values instead.
Complete the following steps:
- Open the Amazon ECS console.
- In the navigation pane, choose Clusters, and then choose the cluster where your container instance is draining.
- Choose the Services tab, and then select the service with the task that's stuck in RUNNING.
- Choose Target Group Name.
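- On the Details tab, scroll down, and then review the Deregistration delay value. The default is 300 seconds. A high value delays the task's transition to the STOPPED state because the load balancer waits that long for in-flight requests to complete before it finishes deregistering the target.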
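To see how the deregistration delay adds to the stop time, the following Python sketch gives a rough estimate. It assumes that the target is deregistered first, and that the container's stop timeout starts only after the deregistration delay elapses. The function name is hypothetical, and the estimate ignores any other overhead.

```python
def worst_case_stop_seconds(deregistration_delay=300, stop_timeout=30):
    """Rough upper bound, in seconds, for a load-balanced task to stop.

    Defaults assume a 300-second deregistration delay and a 30-second
    container stop timeout. Both values are assumptions to replace with
    your own settings.
    """
    # Deregistration delay accepts values from 0 to 3600 seconds.
    if not 0 <= deregistration_delay <= 3600:
        raise ValueError("deregistration_delay must be between 0 and 3600")
    if stop_timeout < 0:
        raise ValueError("stop_timeout must not be negative")
    return deregistration_delay + stop_timeout


if __name__ == "__main__":
    print(worst_case_stop_seconds())        # default settings
    print(worst_case_stop_seconds(30, 30))  # shorter deregistration delay
```

Under these assumptions, the default settings allow up to 330 seconds before a task stops. A 30-second deregistration delay brings that down to 60 seconds.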
Update the ECS_CONTAINER_STOP_TIMEOUT value
Complete the following steps:
- Use SSH to connect to your container instance.
- To find the ECS_CONTAINER_STOP_TIMEOUT value, run the following command:
  docker inspect ecs-agent --format '{{json .Config.Env}}'
- If there's a value for ECS_CONTAINER_STOP_TIMEOUT, then check the duration. A long duration delays the transition to STOPPED for any container that doesn't exit on its own after it receives the stop signal.
Note: ECS_CONTAINER_STOP_TIMEOUT is an Amazon ECS container agent parameter that defines how long Amazon ECS waits before it forcibly ends a container that doesn't exit on its own. The duration starts counting when the task is stopped. If the ECS_CONTAINER_STOP_TIMEOUT parameter doesn't appear in the command output, then Amazon ECS uses the default value of 30s.
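The docker inspect command above prints the agent's environment as a JSON array of KEY=VALUE strings. The following Python sketch shows one way to pull the timeout out of that output and fall back to the 30s default when the parameter isn't set. The function name and the sample values are hypothetical.

```python
import json

DEFAULT_STOP_TIMEOUT = "30s"  # default used when the parameter is absent


def stop_timeout_from_env(env_json):
    """Return the ECS_CONTAINER_STOP_TIMEOUT value found in the JSON
    array of KEY=VALUE strings, or the default if it isn't present."""
    env = json.loads(env_json) or []  # a null Env is treated as empty
    for entry in env:
        key, _, value = entry.partition("=")
        if key == "ECS_CONTAINER_STOP_TIMEOUT":
            return value
    return DEFAULT_STOP_TIMEOUT


if __name__ == "__main__":
    # Hypothetical sample output, not taken from a real instance.
    sample = '["ECS_CLUSTER=demo", "ECS_CONTAINER_STOP_TIMEOUT=2m"]'
    print(stop_timeout_from_env(sample))
```

To use the sketch, save the command output to a file or pipe it in, and then pass the text to stop_timeout_from_env.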
Look for other task-related issues
To look for other task-related issues, use SSH to connect to your container instance. Then, complete the following tasks: