Why can't my Amazon EC2 instance join my Amazon ECS cluster?
I can't register my Amazon Elastic Compute Cloud (Amazon EC2) instance with my Amazon Elastic Container Service (Amazon ECS) cluster.
Resolution
Prerequisite
Before you complete the following manual steps, use the AWSSupport-TroubleshootECSContainerInstance AWS Systems Manager runbook to automatically check for potential issues.
The runbook automatically troubleshoots common reasons why your Amazon EC2 instance can't register with or join a cluster. The runbook checks for the following requirements:
- The user data for the instance contains the correct cluster information. For more information, see Bootstrapping Amazon ECS Linux container instances to pass data.
- The instance profile contains the required permissions.
- The network is correctly configured.
Note: Make sure that you run the AWSSupport-TroubleshootECSContainerInstance runbook in the same AWS Region as your ECS cluster and EC2 instance.
If the runbook's output doesn't provide recommendations, then use the following solutions to manually troubleshoot this issue.
Verify the status of the Amazon ECS agent on the Amazon Linux 2 instance
To check whether the Amazon ECS container agent is running on the instance, run the following command:
sudo systemctl status ecs
If the container agent isn't running on your instance, then run the following command to start the agent:
sudo systemctl start ecs
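On Amazon Linux 2, sudo systemctl start ecs prints nothing when it succeeds, so run sudo systemctl status ecs again and confirm that the service is reported as active (running). On an Amazon Linux 1 instance, where the equivalent command is sudo start ecs, the output looks similar to the following example: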
ecs start/running, process 23403
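After the agent starts, its local introspection endpoint reports which cluster the agent registered with. The following is a minimal sketch, assuming the default introspection port 51678 and that curl is installed on the instance. The function name cluster_from_metadata is hypothetical, and the sed expression is a convenience for this one field, not a general JSON parser.

```shell
#!/bin/bash
# Hypothetical helper: read the agent's introspection JSON on stdin and
# print the value of the "Cluster" field. Prints nothing if the field is absent.
cluster_from_metadata() {
  sed -n 's/.*"Cluster"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Usage on the container instance (assumes the default port 51678):
#   curl -s http://localhost:51678/v1/metadata | cluster_from_metadata
```

If the command prints nothing, or prints a cluster name other than the one you expect, then the agent either hasn't registered or registered with a different cluster.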
Check launch configurations
If the instance is part of an Amazon EC2 Auto Scaling group, then confirm that the Auto Scaling group's launch configuration is correct. For more information, see the Create a new launch configuration step in Refreshing an Amazon ECS Container Instance Cluster with a new AMI.
Check the Amazon Machine Image (AMI) of your instance
If the AMI that you use for the EC2 instance is a copied or custom AMI, then confirm that the instance has the following requirements:
- A Linux distribution that runs at least version 3.10 of the Linux kernel.
- Latest version of the Amazon ECS Linux container agent.
- A Docker daemon that runs at least version 1.9.0, and any Docker runtime dependencies. To view the current Docker version, run the sudo docker version command. For more information, see Install Docker Engine and Install Docker Engine from binaries on the Docker website.
These requirements are preconfigured on the Amazon ECS-optimized AMI. It's a best practice to use an Amazon ECS-optimized AMI unless your application requires a version that's not yet available in that AMI.
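On a copied or custom AMI, the kernel and Docker minimums in the preceding list can be checked with a small version comparison. The following is a sketch, assuming GNU sort (for its -V version-sort option) is available. The function name version_ge is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: succeed when version $1 is greater than or equal to
# version $2. Relies on GNU sort's -V (version sort) option.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Usage on the instance (thresholds are the minimums from the list above):
#   version_ge "$(uname -r | cut -d- -f1)" 3.10 && echo "kernel OK"
#   version_ge "$(sudo docker version --format '{{.Server.Version}}')" 1.9.0 && echo "Docker OK"
```

The comparison covers only the version numbers. It doesn't confirm that the Docker runtime dependencies are installed or that the container agent is the latest version.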
Check if an instance's user data contains the correct cluster information
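Confirm that the instance's user data writes the correct cluster name to the /etc/ecs/ecs.config file. The user data must look similar to the following script, with your cluster name in place of <cluster-name>:
#!/bin/bash
echo ECS_CLUSTER=<cluster-name> >> /etc/ecs/ecs.config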
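To see what the user data actually wrote on a running instance, you can read the cluster name back from the agent configuration file. The following is a minimal sketch. The function name configured_cluster is hypothetical, and it prints the last ECS_CLUSTER line on the assumption that the script appended to a file that might already contain an earlier value.

```shell
#!/bin/bash
# Hypothetical helper: print the last ECS_CLUSTER value set in an agent
# configuration file. The path defaults to /etc/ecs/ecs.config.
configured_cluster() {
  local config_file="${1:-/etc/ecs/ecs.config}"
  sed -n 's/^[[:space:]]*ECS_CLUSTER=//p' "$config_file" | tail -n 1
}

# Usage on the instance:
#   sudo cat /etc/ecs/ecs.config   # inspect the whole file
#   configured_cluster             # print only the cluster name
```

If the function prints nothing, then the file has no ECS_CLUSTER line. If it prints a name that doesn't match your cluster, then correct the user data.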
Verify the log files
If the issue persists, use the Amazon ECS logs collector to collect the logs. Then, review the logs to find the cause. You can also check the log files for the container agent and Docker directly on the container host.
To view the log files for the container agent and Docker, run the following commands:
sudo cat /var/log/ecs/ecs-agent.log.YYYY-MM-DD-**
sudo cat /var/log/docker
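Agent logs can be long, so a quick count of the exception names that appear in the common errors below can show where to look first. The following is a sketch under the assumption that you run it as a user who can read the log files. The function name count_agent_exceptions and the list of names it searches for are illustrative choices, not an exhaustive list of agent errors.

```shell
#!/bin/bash
# Hypothetical helper: for each exception name, count the lines in the given
# log files that mention it, and print "<name> <count>".
count_agent_exceptions() {
  local name
  for name in AccessDeniedException ThrottlingException ClientException; do
    # cat merges all files into one stream so grep prints a single total.
    printf '%s %s\n' "$name" "$(cat "$@" | grep -c "$name")"
  done
}

# Usage on the instance:
#   count_agent_exceptions /var/log/ecs/ecs-agent.log*
```

A nonzero count is only a pointer. Read the matching lines in full to confirm which API call failed and why.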
Troubleshoot common errors
Error: Launching a new EC2 instance. Status Reason: This account is currently blocked and not recognized as a valid account. Please contact aws-verification@amazon.com if you have questions. Launching EC2 instance failed.
This error occurs when your account is blocked and Amazon doesn't recognize your account. To unblock your account, send an email to aws-verification@amazon.com. Make sure to include in your email that you need your account unblocked.
Error: re-registering: ClientException: Container instance 12345678910xxxxxxxxxxxx is inactive.\n\tstatus code: 400, request id: 012345678a-012345b-012ab-0a1-9f645f4s5c12" module=agent.go
This error occurs when the ECS agent can't register the EC2 instance with the ECS cluster because the container instance is inactive. This error is related to the application that runs on the instance. To understand the cause of the error, first check the application. If the error persists, then check the ECS agent logs.
Error: A few instances can join the cluster, but other instances with the same configuration can't join the cluster.
This error occurs because of a ThrottlingException that results when a specific API call exceeds the rate limit. To resolve this error, increase the account-level rate limit. Check for throttled API calls, such as RegisterTargets and RegisterContainerInstance.
Error: After changing the instance type, new instances are unable to join the cluster.
This error occurs when the ECS agent is stuck in the Pending state and you can't change the instance type. To change the instance type in Amazon ECS, complete the following steps:
- Delete the container instance.
- Launch a new container instance that has the new instance size.
Note: It's a best practice to use an Amazon ECS-optimized Amazon Linux 2 AMI to launch the instance for your cluster.
Or, you can create a new launch configuration. Then, update the launch configuration in the Auto Scaling group.
For more information, see How do I change my container instance type in Amazon ECS?
Error: Unable to register as a container instance with ECS: AccessDeniedException: User: arn:aws:sts::1122334455:assumed-role/ecsInstanceRole/i-00aa11bb22cc33def is not authorized to perform: ecs:RegisterContainerInstance on resource: arn:aws:ecs:us-east-1:1122334455:cluster/exampleCluster . status code: 400, request id: 0a123456-7899-10101-a987-6543210deff
-or-
Error: 2019-06-29T16:10:09Z [ERROR] Error re-registering: AccessDeniedException: User: arn:aws:sts::1122334455:assumed-role/ecsInstanceRole/i-0052b2e858b1891ef is not authorized to perform: ecs:RegisterContainerInstance on resource: arn:aws:ecs:us-east-1:1122334455:cluster/exampleCluster status code: 400, request id: 0a123456-7899-10101-a987-123456pqrs
These errors occur due to missing IAM permissions. To resolve these errors, you must create a container instance IAM role.
Then, run the AWSSupport-TroubleshootECSContainerInstance runbook to determine which permissions are missing from the container instance role.
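As a quick manual check, you can also list the policies attached to the role and look for the AWS managed policy AmazonEC2ContainerServiceforEC2Role. The following is a minimal sketch, assuming the AWS CLI is installed and configured with permission to read IAM. The function name has_ecs_instance_policy is hypothetical, and ecsInstanceRole is the role name from the example errors above.

```shell
#!/bin/bash
# Hypothetical helper: read attached policy names on stdin (separated by
# whitespace) and succeed if the AWS managed policy for container instances
# is among them.
has_ecs_instance_policy() {
  tr -s '[:space:]' '\n' | grep -qx 'AmazonEC2ContainerServiceforEC2Role'
}

# Usage (replace ecsInstanceRole with your role name):
#   aws iam list-attached-role-policies --role-name ecsInstanceRole \
#     --query 'AttachedPolicies[].PolicyName' --output text | has_ecs_instance_policy \
#     && echo "managed policy attached" || echo "managed policy not attached"
```

A role can also grant ecs:RegisterContainerInstance through a custom or inline policy, so a missing managed policy doesn't prove that the permission is missing. Treat the result as a starting point.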
Related information
Create a virtual private cloud
Why are my Amazon ECS container instances with Amazon Linux 1 AMIs disconnected?