Unable to register a Container Instance with ECS

0

I have a backend app that I have dockerized and want to run on ECS using the EC2 launch type. This backend app connects to an RDS instance in another account (i.e. the backend app and the RDS instance are not in the same VPC), so I have set up a VPC peering connection between the two VPCs. I have a few questions:

1- Is attaching an ALB to a task launched in a private subnet mandatory? The container launched in the task will connect to the RDS instance in the peered VPC as if it were in the same VPC, through the peering route, and the RDS instance will respond to the backend container over the same peering connection. (I believe the NAT gateway is only needed so that the container in the task can pull the Docker image from ECR.) So I don't see where the ALB fits in this architecture.

NOTE: I have already tried creating an ALB with a target group of the IP address type and attaching the task to it, and I hit the same problem: the task stays in the PENDING state.

2- After I created the service, it was stuck in the CREATE_IN_PROGRESS state. I fixed that by following this answer as a guide: https://repost.aws/questions/QUIz_pDFKUQ-222yiXF9wmww/ecs-service-creation-is-stuck-in-create-in-progress-status . But after setting the desired count back to 1, the task stays in the PENDING state forever. I followed this troubleshooting guide: https://repost.aws/knowledge-center/ecs-tasks-stuck-pending-state , and after SSHing into the EC2 instance that is supposed to host the task and viewing ecs.agent.log, I can see this error:

"Unable to register as a container instance with ECS" error="RequestError: send request failed\ncaused by: Post \"https://ecs.us-east-2.amazonaws.com/\": dial tcp 52.95.17.153:443: i/o timeout
Error registering container instance" error="RequestError: send request failed\ncaused by: Post \"https://ecs.us-east-2.amazonaws.com/\"

Why can't the EC2 instance register with ECS? I have verified that the ECS agent is active using this command:

sudo systemctl status ecs

This is my task definition:

{
    "family": "abc",
    "containerDefinitions": [
        {
            "name": "abc-container",
            "image": "<xyz>.dkr.ecr.us-east-2.amazonaws.com/my-backend:latest",
            "cpu": 1024,
            "memoryReservation": 1024,
            "portMappings": [
                {
                    "name": "abc",
                    "containerPort": 8080,
                    "hostPort": 8080,
                    "protocol": "tcp",
                    "appProtocol": "http"
                }
            ],
            "essential": true,
            "environment": [
                {
                    "name": "MYSQL_DATABASE",
                    "value": "value-1"
                },
                {
                    "name": "MYSQL_USERNAME",
                    "value": "value-2"
                },
                {
                    "name": "MYSQL_PASSWORD",
                    "value": "value-3"
                },
                {
                    "name": "MYSQL_PORT",
                    "value": "value-3"
                },
                {
                    "name": "MYSQL_HOST",
                    "value": "value-4"
                }
            ],
            "environmentFiles": [],
            "mountPoints": [],
            "volumesFrom": [],
            "ulimits": [],
            "logConfiguration": {
                "logDriver": "awslogs",
                "options": {
                    "awslogs-create-group": "true",
                    "awslogs-group": "/ecs/my-backend",
                    "awslogs-region": "us-east-2",
                    "awslogs-stream-prefix": "ecs"
                },
                "secretOptions": []
            },
            "systemControls": []
        }
    ],
    "taskRoleArn": "arn:aws:iam::<xyz>:role/ecsTaskExecutionRole",
    "executionRoleArn": "arn:aws:iam::<xyz>:role/ecsTaskExecutionRole",
    "networkMode": "awsvpc",
    "requiresCompatibilities": [
        "EC2"
    ],
    "cpu": "1024",
    "memory": "3072",
    "runtimePlatform": {
        "cpuArchitecture": "ARM64",
        "operatingSystemFamily": "LINUX"
    }
}

This is my Dockerfile:

# Build stage: compile and package the application with Maven
FROM maven:3.9.6-eclipse-temurin-8 AS build
# Set the working directory
WORKDIR /app
# Copy pom.xml and the project sources into the container
COPY pom.xml .
COPY src ./src
# Build the application with Maven, skipping tests
RUN mvn clean package -DskipTests

# Runtime stage: use the official eclipse-temurin JRE image as the base image
FROM --platform=linux/arm64 eclipse-temurin:8-jre
# Set the working directory inside the container
WORKDIR /app
EXPOSE 8080
# Copy the built JAR file from the build stage into the container
COPY --from=build /app/target/my-app.jar .
# Set the command that runs the application
CMD ["java","-jar","my-app.jar"]

This is my Docker Compose file, where I reference the env variables defined in the task:

version: '3.7'
services:
  my-app:
    container_name: my-app
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "8080:8080"
    environment:
      - MYSQL_HOST=${MYSQL_HOST}
      - MYSQL_USERNAME=${MYSQL_USERNAME}
      - MYSQL_PASSWORD=${MYSQL_PASSWORD}
      - MYSQL_DATABASE=${MYSQL_DATABASE}
      - MYSQL_PORT=${MYSQL_PORT}

Parameters I used to create the cluster:

  • AWS EC2 instances: selected

  • AWS Fargate: unselected

  • Created a new ASG

  • AMI: Amazon Linux 2 (kernel 5.10)

  • Desired capacity: min 0, max 1

  • SSH .pem key enabled

  • Instance type: t3.medium (2 vCPU, 4 GB memory)

  • Network:

       - VPC (chose the VPC that is peered with the VPC containing the RDS instance)

       - Subnet (chose only the private subnet with a NAT gateway and a route to the peered VPC)

       - Security group (allowing all inbound TCP connections on all ports)
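
For a private subnet like the one above to reach the ECS endpoint, its default route has to point at a NAT gateway, and that NAT gateway in turn has to sit in a subnet whose default route points at an internet gateway. Below is a minimal sketch of that check in Python. It assumes each route is a dict shaped like the entries boto3's describe_route_tables returns (keys DestinationCidrBlock, GatewayId, NatGatewayId); the function names classify_subnet and nat_placement_ok are hypothetical, and IPv6 default routes are ignored.

```python
def classify_subnet(routes):
    """Classify a subnet by the target of its IPv4 default route.

    `routes` is a list of dicts in the shape of the "Routes" entries
    returned by EC2 DescribeRouteTables (an assumption of this sketch).
    """
    for route in routes:
        if route.get("DestinationCidrBlock") != "0.0.0.0/0":
            continue
        # Default route to an internet gateway: the subnet is public.
        if (route.get("GatewayId") or "").startswith("igw-"):
            return "public"
        # Default route to a NAT gateway: private, with outbound access.
        if (route.get("NatGatewayId") or "").startswith("nat-"):
            return "private-with-nat"
        # Some other target (instance, transit gateway, ...).
        return "other"
    # No default route at all: no path to the internet.
    return "isolated"


def nat_placement_ok(instance_subnet_routes, nat_subnet_routes):
    """True if the instance subnet routes through a NAT gateway that
    itself lives in a public subnet."""
    return (
        classify_subnet(instance_subnet_routes) == "private-with-nat"
        and classify_subnet(nat_subnet_routes) == "public"
    )
```

If nat_placement_ok returns False for the route tables of the instance subnet and the NAT gateway's subnet, outbound calls from the instance (including the ECS agent's registration request) are likely to time out.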
    
Mahmoud
asked 11 days ago, 283 views
4 Answers
5

Hello,

1. ALB (Application Load Balancer): In your architecture, the backend app talks to the RDS instance directly over the VPC peering connection, so the ALB plays no part in that path. An ALB routes incoming HTTP or HTTPS traffic to your containers; if nothing needs to send HTTP requests to your backend, you don't need one.

However, if your backend app has to serve HTTP requests from the internet, you do need an ALB. In that case the containers can stay in private subnets, but the internet-facing ALB must be placed in public subnets, that is, subnets whose route table has a default route to an internet gateway (not a NAT gateway).

2. ECS service stuck in CREATE_IN_PROGRESS state: This can have several causes. Common ones are insufficient resources in the ECS cluster or missing IAM permissions. Ensure that your cluster has enough CPU and memory available and that the IAM roles used by ECS have the permissions they need, for example to register container instances.

The error message you're seeing, "Unable to register as a container instance with ECS", means the EC2 instance failed to register itself with ECS. This could be due to network issues or permission issues.

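3. Network issues: The error message shows a timeout while connecting to the ECS API endpoint (ecs.us-east-2.amazonaws.com) on port 443. An i/o timeout at the dial step means the TCP connection was never established, which points at routing or filtering rather than permissions. Ensure that the EC2 instance has outbound internet access and that nothing (security group egress rules, network ACLs, route tables) blocks outbound connections to the ECS endpoint.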
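
One way to confirm this from the instance itself is to attempt the same TCP connection the agent makes. The following is a minimal sketch using only the Python standard library; the function name can_reach is hypothetical, and the 5-second timeout is an arbitrary choice.

```python
import socket


def can_reach(host, port=443, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened.

    Any failure (DNS error, refusal, timeout) is reported as False,
    since all of them would stop the ECS agent from registering.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example, run on the container instance:
#   print(can_reach("ecs.us-east-2.amazonaws.com"))
```

If this returns False on the instance but True from a machine with normal internet access, the problem is in the instance's network path (routes, NAT gateway, security group, or network ACLs), not in the ECS agent or IAM.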

4. IAM permissions: Double-check the IAM role (ecsInstanceRole) attached to your EC2 instances. Ensure it has the necessary permissions to interact with ECS, including registering container instances. You can attach the managed policy AmazonEC2ContainerServiceforEC2Role to this role to grant the required permissions.

answered 11 days ago
  • I have double-checked the points you mentioned, but no luck so far.

1

Hi

Here are the answers to your questions:

  1. You're correct that an ALB in a private subnet isn't mandatory in this scenario. Since your backend container connects to the RDS instance in the peered VPC over the VPC peering connection, the ALB wouldn't be involved in routing that traffic.

  2. Registration and service issue:
  • Ensure the instance is registered. You can check this in the Infrastructure section of the ECS console. Also ensure that your ASG has a desired capacity of at least 1.

Additional Information: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/asg-capacity-providers.html

GK
answered 11 days ago
EXPERT
reviewed 11 days ago
0

After fixing the network connectivity issue (it turned out that I had placed the NAT gateway in a private subnet instead of a public one), the task is still in the PROVISIONING status, and when I open ecs.agent.log I see the following output repeating in an endless loop:

Connected to TCS endpoint

Websocket connection established.

Connected to ACS endpoint

Successfully loaded ebs-csi-driver container image from tarball

"Successfully loaded Managed Daemon image

Image excluded from cleanup" image="ebs-csi-driver:latest"

TCS WebSocket connection closed for a valid reason

Reconnecting to ACS immediately without waiting

Using cached DiscoverPollEndpoint

Establishing a Websocket connection

Websocket connection established

Connected to ACS endpoint

What does it mean?

Mahmoud
answered 10 days ago
0

Thanks for your help. I deleted the entire cluster and created a new one, then configured an ALB and attached it to the service. I was using a t3.medium instance, which comes with 2 vCPUs and 4 GB of memory. I set the task size to 1 vCPU and 3 GB of memory, and since I was planning to run only one container, in the container's resource allocation limits section I set the CPU to 1 vCPU and the memory soft and hard limits to 1 GB and 3 GB respectively. It worked, but another problem popped up.
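
For anyone who ends up here with a task that never gets placed, the sizing above boils down to a few inequalities that can be checked before deploying. Here is a rough sketch in Python, with CPU in ECS CPU units (1024 per vCPU) and memory in MiB. The function name sizing_problems is hypothetical, and the instance's available memory has to be read from the container instance's registered resources in the ECS console, since it is somewhat lower than the nominal 4 GB.

```python
def sizing_problems(task_cpu, task_mem, soft_mem, hard_mem,
                    instance_cpu, instance_mem):
    """Return a list of human-readable sizing problems (empty if none).

    task_cpu / task_mem: task-level size from the task definition.
    soft_mem / hard_mem: container memoryReservation and memory limits.
    instance_cpu / instance_mem: resources the container instance has
    registered with ECS (read these from the console; not computed here).
    """
    problems = []
    if soft_mem > hard_mem:
        problems.append("container soft limit exceeds hard limit")
    if hard_mem > task_mem:
        problems.append("container hard limit exceeds task memory")
    if task_cpu > instance_cpu:
        problems.append("task CPU exceeds instance CPU")
    if task_mem > instance_mem:
        problems.append("task memory exceeds instance memory")
    return problems


# Task size 1 vCPU / 3 GB, container limits 1 GB soft and 3 GB hard,
# on an instance with 2 vCPUs. The 3800 MiB figure is a placeholder,
# not a measured value.
print(sizing_problems(1024, 3072, 1024, 3072, 2048, 3800))
```

An empty list only means the numbers are consistent with each other; the scheduler can still refuse placement for other reasons, such as ENI limits in awsvpc mode.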

Mahmoud
answered 7 days ago
