Why is my Amazon EC2 instance unable to join my Amazon ECS cluster?

I can't register my Amazon Elastic Compute Cloud (Amazon EC2) instance with an Amazon Elastic Container Service (Amazon ECS) cluster.

Resolution

Prerequisite

Before you complete the following manual steps, use the AWSSupport-TroubleshootECSContainerInstance AWS Systems Manager runbook to automatically check for potential issues.

The AWSSupport-TroubleshootECSContainerInstance AWS Systems Manager runbook automatically troubleshoots common reasons why your Amazon EC2 instance can't register or join a cluster. The runbook checks for the following requirements:

Note: Run the AWSSupport-TroubleshootECSContainerInstance runbook in the same AWS Region as your Amazon ECS cluster and EC2 instance.

If the runbook's output doesn't provide recommendations, then use the following solutions to manually troubleshoot this issue.

Verify the status of the Amazon ECS agent on the Amazon Linux 2 instance

To check whether the Amazon ECS container agent is running on the instance, run the following command:

sudo systemctl status ecs

If the container agent isn't running on your instance, then run the following command to start the agent:

sudo systemctl start ecs

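On Amazon Linux 2, systemctl start returns no output when it succeeds, so run the status command again to confirm that the agent is active. On instances that use Upstart instead (for example, Amazon Linux 1, where the command is sudo start ecs), the output looks similar to the following example: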

ecs start/running, process 23403
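The two commands above can be combined so that the agent is started only when systemd reports that it isn't active. The following is a minimal sketch, not an official script. The function name ensure_ecs_agent and the SYSTEMCTL override variable are hypothetical and exist only for illustration.

```shell
#!/bin/bash
# Sketch: start the ECS agent only if systemd reports that it is not active.
# SYSTEMCTL is a hypothetical override so that the function can be exercised
# on a machine without systemd; it defaults to the real systemctl binary.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

ensure_ecs_agent() {
  # "is-active --quiet" prints nothing and returns 0 only when the unit is active.
  if "$SYSTEMCTL" is-active --quiet ecs; then
    echo "ecs agent is already running"
  else
    echo "ecs agent is not running; starting it"
    sudo "$SYSTEMCTL" start ecs
  fi
}
```

To use the sketch on the instance, source the file and then run ensure_ecs_agent.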

Check launch configurations

If the instance is part of an Amazon EC2 Auto Scaling group, then confirm that the Auto Scaling group's launch configuration is correct. For more information, see the Create a new launch configuration step in Refreshing an Amazon ECS Container Instance Cluster with a new AMI.
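One way to inspect a launch configuration is to decode its user data and read the cluster name that it sets. The following sketch assumes that the user data is a shell script that writes ECS_CLUSTER to /etc/ecs/ecs.config. The function name user_data_cluster is hypothetical.

```shell
#!/bin/bash
# Sketch: read base64-encoded user data on stdin and print every
# ECS_CLUSTER value that the decoded script sets.
# Assumes the value contains no spaces.
user_data_cluster() {
  base64 --decode | sed -n 's/.*ECS_CLUSTER=\([^ ]*\).*/\1/p'
}
```

The describe-launch-configurations command returns user data in base64-encoded form, so its output can be piped into the function. For example, run aws autoscaling describe-launch-configurations --launch-configuration-names <launch-configuration-name> --query 'LaunchConfigurations[0].UserData' --output text | user_data_cluster and compare the result with the name of your cluster.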

Check the Amazon Machine Image (AMI) of your instance

If the AMI that you use for the EC2 instance is a copied or custom AMI, then confirm that the instance has the following requirements:

These requirements are preconfigured on the Amazon ECS optimized AMI. It's a best practice to use an Amazon ECS optimized AMI unless your application requires a version that's not yet available in that AMI.

Check if an instance's user data contains the correct cluster information

Confirm that the instance's user data writes the correct cluster name to /etc/ecs/ecs.config, similar to the following example script:

#!/bin/bash
echo ECS_CLUSTER=<cluster-name> >> /etc/ecs/ecs.config
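After the instance launches, you can also confirm that the user data script ran by reading the configuration file on the instance. The following sketch compares the ECS_CLUSTER value in the file against the cluster name that you expect. The function names configured_clusters and cluster_matches are hypothetical.

```shell
#!/bin/bash
# Sketch: print every ECS_CLUSTER value found in an ecs.config-style file.
# $1: path to the file (defaults to the path that the user data script writes to)
configured_clusters() {
  config_file="${1:-/etc/ecs/ecs.config}"
  grep '^ECS_CLUSTER=' "$config_file" | cut -d= -f2-
}

# Sketch: succeed only if the file sets exactly one distinct cluster name
# and that name equals the expected one.
# $1: expected cluster name, $2: optional path to the config file
cluster_matches() {
  expected="$1"
  values="$(configured_clusters "$2" | sort -u)"
  [ "$values" = "$expected" ]
}
```

For example, cluster_matches <cluster-name> && echo "cluster name is set correctly" prints the message only when the file names the expected cluster. Because the user data script appends with >>, a file that was written more than once with different names fails the check.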

Verify the log files

If the issue persists, then use the Amazon ECS logs collector to collect the logs, and review them to find the cause. You can also check the log files for the container agent and Docker directly on the container host.

To view the log files for the container agent and Docker, run the following commands:

sudo cat /var/log/ecs/ecs-agent.log.YYYY-MM-DD-**
sudo cat /var/log/docker
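To narrow down a long agent log, it can help to count how often each error signature from this article appears. The following is a rough sketch. The function name summarize_agent_errors is hypothetical, and the three search patterns are taken from the error messages in the Troubleshoot common errors section, not from an exhaustive list.

```shell
#!/bin/bash
# Sketch: count the lines in one or more agent log files that contain
# each error signature discussed in this article.
# $@: paths to agent log files
summarize_agent_errors() {
  for pattern in 'AccessDeniedException' 'ThrottlingException' 'is inactive'; do
    # grep -c exits non-zero when the count is 0, so "|| true" keeps going.
    count="$(cat "$@" | grep -c "$pattern" || true)"
    echo "$pattern: $count"
  done
}
```

Run the function as a user that can read the log files, and pass it the agent log files that you want to scan.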

Troubleshoot common errors

Error: Launching a new EC2 instance. Status Reason: This account is currently blocked and not recognized as a valid account. Please contact aws-verification@amazon.com if you have questions. Launching EC2 instance failed.

This error occurs when your account is blocked and Amazon doesn't recognize it as a valid account. To unblock your account, send an email to aws-verification@amazon.com. In the email, state that you need your account unblocked.

Error: re-registering: ClientException: Container instance 12345678910xxxxxxxxxxxx is inactive.\n\tstatus code: 400, request id: 012345678a-012345b-012ab-0a1-9f645f4s5c12" module=agent.go

This error occurs when the ECS agent tries to re-register the EC2 instance with the ECS cluster, but the container instance is inactive. This error is related to the application that runs on the instance. To understand the cause of the error, first check the application. If the error persists, then check the ECS agent logs.

Error: Some instances can join the cluster, but other instances with the same configuration can't.

This error occurs because of a ThrottlingException that results when a specific API call exceeds its rate limit. To resolve this error, increase the account-level rate limit. Check for throttled API calls, such as RegisterTargets and RegisterContainerInstance.

Error: After changing the instance type, new instances are unable to join the cluster.

This error occurs when the ECS agent is stuck in the Pending state and you can't change the instance type. To change the instance type in Amazon ECS, complete the following steps:

  1. Delete the container instance.
  2. Launch a new container instance that has the new instance size.
    Note: It's a best practice to use an Amazon ECS optimized Amazon Linux 2 AMI to launch the instance for your cluster.

Or, you can create a new launch configuration. Then, update the launch configuration in the Auto Scaling group.

For more information, see How do I change my container instance type in Amazon ECS?

Error: Unable to register as a container instance with ECS: AccessDeniedException: User: arn:aws:sts::1122334455:assumed-role/ecsInstanceRole/i-00aa11bb22cc33def is not authorized to perform: ecs:RegisterContainerInstance on resource: arn:aws:ecs:us-east-1:1122334455:cluster/exampleCluster . status code: 400, request id: 0a123456-7899-10101-a987-6543210deff

-or-

Error: 2019-06-29T16:10:09Z [ERROR] Error re-registering: AccessDeniedException: User: arn:aws:sts::1122334455:assumed-role/ecsInstanceRole/i-0052b2e858b1891ef is not authorized to perform: ecs:RegisterContainerInstance on resource: arn:aws:ecs:us-east-1:1122334455:cluster/exampleCluster status code: 400, request id: 0a123456-7899-10101-a987-123456pqrs

These errors occur due to missing IAM permissions. To resolve these errors, you must create a container instance IAM role.

Then, run the AWSSupport-TroubleshootECSContainerInstance runbook to determine which permissions are missing from the container instance role.
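Before you review the role, it can help to pull the role name, the denied action, and the target resource out of the agent log. The following sketch assumes that the log lines have the same shape as the example errors above. The function name parse_access_denied is hypothetical.

```shell
#!/bin/bash
# Sketch: read agent log lines on stdin and, for each AccessDeniedException
# line shaped like the examples in this article, print:
#   <role name> <denied action> <resource ARN>
parse_access_denied() {
  sed -n 's|.*assumed-role/\([^/]*\)/.* is not authorized to perform: \([^ ]*\) on resource: \([^ ]*\).*|\1 \2 \3|p'
}
```

For example, sudo cat /var/log/ecs/ecs-agent.log* | parse_access_denied | sort -u lists each distinct combination once. You can then pass the role name to aws iam list-attached-role-policies --role-name <role-name> to see which policies are attached to the role.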

Related information

Create a virtual private cloud

Why are my Amazon ECS container instances with Amazon Linux 1 AMIs disconnected?

Amazon ECS troubleshooting

Creating your own runbooks