How can I get my Amazon ECS tasks that use the Amazon EC2 launch type to pass the Application Load Balancer health check?

I want to learn how to troubleshoot and resolve issues where Amazon Elastic Container Service (Amazon ECS) tasks running on my Amazon Elastic Compute Cloud (Amazon EC2) instances fail the Application Load Balancer health checks by checking connectivity, health check settings, my application configuration, and container instance status.

Short description

When your Amazon ECS task fails the load balancer health check, you receive one of the following errors from your Amazon ECS service event message:

  • "(service AWS-service) (port 8080) is unhealthy in (target-group arn:aws:elasticloadbalancing:us-east-1:111111111111:targetgroup/aws-targetgroup/123456789) due to (reason Health checks failed with these codes: [502 or 504]) or (request timeout)"
  • "(service AWS-Service) (port 8080) is unhealthy in target-group tf-20190411170 due to (reason Health checks failed)"
  • "(service AWS-Service) (instance i-1234567890abcdefg) (port 443) is unhealthy in (target-group arn:aws:elasticloadbalancing:us-east-1:111111111111:targetgroup/aws-targetgroup/123456789) due to (reason Health checks failed)"

You might also receive the following error from your Amazon ECS task console:

"Task failed ELB health checks in (target-group arn:aws:elasticloadbalancing:us-east-1:111111111111:targetgroup/aws-targetgroup/123456789)"

If you get the error "(service AWS-Service) (task c13b4cb40f1f4fe4a2971f76ae5a47ad) failed container health checks," then see How do I troubleshoot the container health check failures for Amazon ECS tasks?

Note: An Amazon ECS task can return the unhealthy status for many reasons. If the following steps don't resolve your issue, then see Troubleshooting service load balancers in Amazon ECS. To find out why your Amazon ECS task was stopped, see View Amazon ECS stopped task errors.

Resolution

Note: If you receive errors when you run AWS Command Line Interface (AWS CLI) commands, then see Troubleshoot AWS CLI errors. Also, make sure that you're using the most recent AWS CLI version.

To troubleshoot load balancer health check issues on your Amazon ECS task, check the following configurations:

  • Connectivity between your load balancer and Amazon ECS task
  • Health check settings of your target group
  • Status and configuration of the application in your ECS container
  • Status of the container instance

Check the connectivity between your load balancer and Amazon ECS task

To verify that your load balancer can perform health checks on your Amazon ECS tasks, review the following information.

The security groups attached to your load balancer and container instance or the Amazon ECS task elastic network interface for awsvpc network mode are configured correctly

It's a best practice to configure different security groups for your load balancer and for your container instance or task elastic network interface. With this approach, you can allow all traffic between your load balancer and your container instances or task elastic network interfaces. Alternatively, you can configure your container instances to accept traffic on the port that's specified for the task.

Review the following configurations:

  • Confirm that the security group associated with your load balancer allows egress traffic to your container instances or task elastic network interface on the registered port. Confirm that the same is true for the health check port that's associated with your container instance, if applicable.
  • Confirm that the security group associated with your container instance or task elastic network interface allows all ingress traffic on the task host port range from the security group that's associated with your load balancer. To check the security group that's associated with your load balancer, see Security groups for your Application Load Balancer.

Important: When you use dynamic port mapping, the service is exposed on a dynamically assigned port (typically in the range 32768-65535) rather than on a fixed host port. Confirm that the ingress rules of your container instance security group allow the ephemeral port range with the load balancer security group as the source.
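
To review these ingress rules from the command line, you can use a sketch like the following one. It assumes that the AWS CLI is configured with permissions to describe security groups. The function names and the security group IDs are hypothetical placeholders, not values from your environment.

```shell
# Sketch: list the ingress rules on the container instance (or task ENI)
# security group that use the load balancer security group as their source.
# Each output row is: protocol, first port, last port.
# A rule that allows all traffic shows protocol -1 and no port values.
list_ingress_from_lb() {
  instance_sg="$1"   # security group of the container instance or task ENI
  lb_sg="$2"         # security group of the load balancer
  aws ec2 describe-security-groups \
    --group-ids "$instance_sg" \
    --query "SecurityGroups[0].IpPermissions[?UserIdGroupPairs[?GroupId=='${lb_sg}']].[IpProtocol,FromPort,ToPort]" \
    --output text
}

# Succeeds when port $1 falls inside the rule range $2-$3.
port_in_range() {
  [ "$1" -ge "$2" ] && [ "$1" -le "$3" ]
}

# Example (placeholder IDs):
#   list_ingress_from_lb sg-0aaaaaaaaaaaaaaaa sg-0bbbbbbbbbbbbbbbb
#   port_in_range 40000 32768 65535 && echo "host port is covered"
```

If the command returns no rows, then no ingress rule on that security group names the load balancer security group as a source.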

Your load balancer is configured in the same Availability Zone as your container instance or Amazon ECS task elastic network interface for awsvpc network mode

When you configure an Availability Zone for your load balancer, Elastic Load Balancing creates a load balancer node in the Availability Zone. If you register targets in an Availability Zone, but don't turn on the Availability Zone, then these registered targets don't receive traffic. For more information, see Availability Zones and load balancer nodes.

To find out the Availability Zones that your load balancer is configured for, complete the following steps:

  1. Open the Amazon EC2 console.
  2. In the navigation pane, under Load Balancing, choose Load Balancers.
  3. Select the load balancer that you're using for your Amazon ECS service.
  4. On the Description tab, you can view the Availability Zones under the Availability Zones field.

Note: For an Application Load Balancer, you can turn on or turn off the Availability Zones at any time. For a Network Load Balancer, you can't turn off the Availability Zones after you turn on this option. Instead, turn on additional Availability Zones.

If you use Application Load Balancers, then cross-zone load balancing is always turned on. If you use Network Load Balancers, then cross-zone load balancing is turned off by default. After you create the Network Load Balancer, you can turn cross-zone load balancing on or off at any time. For more information, see How Elastic Load Balancing works.

To find out the Availability Zones that your container instances are configured for, complete the following steps:

  1. Open the Amazon EC2 console.
  2. In the navigation pane, under Auto Scaling, choose Auto Scaling Groups.
  3. Select the container instance Auto Scaling group that's associated with your cluster.
  4. On the Details tab, under Network, verify that the Availability Zones listed match the Availability Zones listed for your load balancer.
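
The same comparison can be sketched with the AWS CLI. The following example assumes that your container instances belong to an Auto Scaling group. The load balancer name, the Auto Scaling group name, and the function names are hypothetical placeholders.

```shell
# Sketch: compare the Availability Zones of a load balancer with the
# Availability Zones of the container instance Auto Scaling group.

# Print the zones that are turned on for the load balancer named $1.
lb_zones() {
  aws elbv2 describe-load-balancers --names "$1" \
    --query 'LoadBalancers[0].AvailabilityZones[].ZoneName' --output text
}

# Print the zones that the Auto Scaling group named $1 launches into.
asg_zones() {
  aws autoscaling describe-auto-scaling-groups --auto-scaling-group-names "$1" \
    --query 'AutoScalingGroups[0].AvailabilityZones' --output text
}

# Print each zone from list $2 that doesn't appear in list $1.
missing_zones() {
  enabled=" $(printf '%s' "$1" | tr '\t\n' '  ') "
  for zone in $2; do   # word splitting on whitespace is intended here
    case "$enabled" in
      *" $zone "*) ;;
      *) printf '%s\n' "$zone" ;;
    esac
  done
}

# Example (placeholder names):
#   missing_zones "$(lb_zones my-load-balancer)" "$(asg_zones my-ecs-asg)"
```

Any zone that the example prints is used by the container instances but isn't turned on for the load balancer.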

To modify the Availability Zones of your cluster, open the AWS CloudFormation console. Then, choose the CloudFormation stack for your cluster, and update the stack. Lastly, on the Specify stack details page, update your Subnet IDs configuration.

To find out the Availability Zones that your task elastic network interface for Amazon Virtual Private Cloud (Amazon VPC) is configured for, complete the following steps:

  1. Open the Amazon ECS console.
  2. In the navigation pane, choose Clusters, and then select the cluster that contains your service.
  3. On the Services tab of your cluster's page, in the Service Name column, select the service that you want to check.
  4. Choose the Configuration and Networking tab. To view the subnets that are configured for the service, check Subnets under Network configuration.
  5. Open the Amazon VPC console to view the subnets.
  6. Verify that the Availability Zones of the subnets match the Availability Zones listed for your load balancer.

Note: You can't change the subnet configuration of an Amazon ECS service from the Amazon ECS console. Use the AWS CLI update-service command to change the subnet configuration.
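
The following sketch shows one way to view and change the subnets with the AWS CLI. It assumes a service that uses the awsvpc network mode. The cluster name, service name, subnet IDs, security group IDs, and function names are hypothetical placeholders.

```shell
# Sketch: view and change the subnets of an Amazon ECS service.

# Print the subnets that are configured for service $2 in cluster $1.
service_subnets() {
  aws ecs describe-services --cluster "$1" --services "$2" \
    --query 'services[0].networkConfiguration.awsvpcConfiguration.subnets' \
    --output text
}

# Set the subnets ($3) and security groups ($4) for service $2 in cluster $1.
# Both lists are comma-separated. The security groups are passed too because
# this call supplies the whole awsvpcConfiguration.
set_service_subnets() {
  aws ecs update-service --cluster "$1" --service "$2" \
    --network-configuration "awsvpcConfiguration={subnets=[$3],securityGroups=[$4]}"
}

# Example (placeholder values):
#   service_subnets my-cluster my-service
#   set_service_subnets my-cluster my-service \
#     subnet-0aaaaaaaaaaaaaaaa,subnet-0bbbbbbbbbbbbbbbb sg-0cccccccccccccccc
```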

The network access control lists (network ACLs) associated with the subnets of your load balancer and of your ECS container instances or Amazon ECS task elastic network interface for awsvpc network mode are correctly configured

The subnets for your load balancer and your container instance or task elastic network interface might be different. To make sure that traffic is allowed between these subnets, review the following configurations:

  • Verify that the network ACL associated with the subnets for your load balancer allows ingress traffic on the ephemeral ports (1024-65535) and the listener port. Also, verify that the network ACL allows egress traffic on the health check and ephemeral ports.
  • Verify that the network ACL associated with the subnets for your container instance or task elastic network interface for awsvpc mode allows ingress traffic on the health check port. Also, verify that the network ACL allows egress traffic on the ephemeral ports.

For more information about network ACLs, see Work with network ACLs.
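
To list the network ACL entries for a subnet from the command line, you can use a sketch like the following one. The subnet ID and the function name are hypothetical placeholders.

```shell
# Sketch: print the entries of the network ACL that's associated with subnet $1.
# Each output row is: egress flag, rule number, action, protocol, CIDR block,
# first port, last port. Protocol -1 means all protocols; rules without a
# port range show no port values.
subnet_nacl_entries() {
  aws ec2 describe-network-acls \
    --filters "Name=association.subnet-id,Values=$1" \
    --query 'NetworkAcls[].Entries[].[Egress,RuleNumber,RuleAction,Protocol,CidrBlock,PortRange.From,PortRange.To]' \
    --output text
}

# Example (placeholder ID). Run it once for a load balancer subnet and once
# for a container instance or task subnet:
#   subnet_nacl_entries subnet-0aaaaaaaaaaaaaaaa
```

Network ACL rules are evaluated in order of rule number, so review both the allow and the deny rows in the output.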

Check the health check settings of your target group

To make sure that the health check settings for your target group are correctly configured, complete the following steps:

  1. Open the Amazon EC2 console.
  2. In the navigation pane, under Load Balancing, choose Target Groups.
  3. Select your target group.
    Important: Use a new target group. Because Amazon ECS automatically registers and deregisters ECS tasks with the target group, don't manually add targets to the target group.
  4. On the Health checks tab, review the following settings:
    Check the Port and Path fields. If these fields aren't correctly configured, then Amazon ECS might ask your load balancer to deregister the task because of failing health checks.
    For Port, choose traffic port.
    Note: If you choose Override, then confirm that the port specified matches the task host port.
    For Timeout, make sure that the response timeout value is correct.
    Note: The response timeout is the amount of time that your container has to return a response to the health check ping. If this value is lower than the amount of time required for a response, then the health check fails.
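
You can also review these settings with the AWS CLI. The following sketch is one possible approach. The function names are illustrative, and the target group ARN in the example is the sample ARN from the error messages in this article.

```shell
# Sketch: review the health check settings and target health of a target group.

# Print protocol, port, path, timeout, interval, and matcher for target group $1.
show_health_check_settings() {
  aws elbv2 describe-target-groups --target-group-arns "$1" \
    --query 'TargetGroups[0].[HealthCheckProtocol,HealthCheckPort,HealthCheckPath,HealthCheckTimeoutSeconds,HealthCheckIntervalSeconds,Matcher.HttpCode]' \
    --output text
}

# Print each registered target with its port, health state, and reason code.
show_target_health() {
  aws elbv2 describe-target-health --target-group-arn "$1" \
    --query 'TargetHealthDescriptions[].[Target.Id,Target.Port,TargetHealth.State,TargetHealth.Reason]' \
    --output text
}

# Set the response timeout of target group $1 to $2 seconds.
set_health_check_timeout() {
  aws elbv2 modify-target-group --target-group-arn "$1" \
    --health-check-timeout-seconds "$2"
}

# Example (sample ARN):
#   arn="arn:aws:elasticloadbalancing:us-east-1:111111111111:targetgroup/aws-targetgroup/123456789"
#   show_health_check_settings "$arn"
#   show_target_health "$arn"
```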

Check the status and configuration of the application in your ECS container

Confirm that the application in your ECS container responds to your load balancer health check

To make sure that the application in your ECS container responds to your load balancer health check correctly, complete the following tasks:

  • Check that the ping port and the health check path for your target group are correctly configured.
  • Monitor the CPU and memory utilization metrics for the Amazon ECS service. For example, high CPU can make your application unresponsive and cause a 502 error or timeout.
  • Define a minimum health check grace period. This setting instructs the service scheduler to ignore the Elastic Load Balancing health checks for a predefined time period after a task starts. Your Amazon ECS task might require a longer health check grace period to register with a Network Load Balancer.
  • Check your application logs for application errors. For more information, see Send Amazon ECS logs to CloudWatch.
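
As an illustration of the grace period setting, the following sketch reads and updates the value with the AWS CLI. The cluster name, service name, function names, and the 120-second value are hypothetical. Choose a value that fits your application's startup time.

```shell
# Sketch: view and set the health check grace period of an Amazon ECS service.

# Print the current grace period, in seconds, of service $2 in cluster $1.
show_grace_period() {
  aws ecs describe-services --cluster "$1" --services "$2" \
    --query 'services[0].healthCheckGracePeriodSeconds' --output text
}

# Set the grace period of service $2 in cluster $1 to $3 seconds.
set_grace_period() {
  aws ecs update-service --cluster "$1" --service "$2" \
    --health-check-grace-period-seconds "$3"
}

# Example (placeholder values):
#   show_grace_period my-cluster my-service
#   set_grace_period my-cluster my-service 120
```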

Confirm that the application in your ECS container returns the correct response code

When the load balancer sends an HTTP GET request to the health check path, the application in your ECS container is expected to return the default 200 OK response code.

Note: If you use an Application Load Balancer, then you can update the Matcher setting to a response code other than 200. For more information, see Health checks for your target groups.

To confirm that the application in your ECS container returns the correct response code, complete the following steps:

  1. Use SSH, Session Manager, or EC2 Instance Connect to connect to your container instance.

  2. (Optional) Install curl with the command appropriate for your system.
    For Amazon Linux and other RPM-based distributions, run the following command:

    sudo yum -y install curl

    For Debian-based systems (such as Ubuntu), run the following command:

    sudo apt-get install curl
  3. To get the container ID, run the following command:

    docker ps

    Note: The port for the local listener is displayed in the command output under PORTS at the end of the sequence. For example, in 0.0.0.0:32768->8080/tcp, the local listener port is 8080.

  4. To get the IP address of the container, run the docker inspect command:

    IPADDR=$(docker inspect --format='{{.NetworkSettings.IPAddress}}' 112233445566)

    Note: The IP address of the container is saved in IPADDR. Use this command only if you use the bridge network mode. Replace 112233445566 with the container ID from the docker ps output. If you use the awsvpc network mode, then use the IP address that's assigned to the task elastic network interface. If you use the host network mode, then use the IP address of the host (container instance) that the task is exposed through.

  5. To get the status code, run a curl command that includes IPADDR and the port of the local listener. For example, if you run the curl command on a container listening on port 8080 with the health check path of /health, then the command must return the response code 200 OK.

    curl -I http://${IPADDR}:8080/health

If you receive a non-HTTP error message, then your application isn't listening for HTTP traffic. If you receive an HTTP status code that's different from what you specified in the Matcher setting, then your application is listening for HTTP traffic but isn't returning the status code for a healthy target.
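
The check above can be sketched as a small script that compares the returned status code with your Matcher value. The function names, the URL, and the default matcher of 200 are illustrative assumptions. The sketch sends a GET request (without the -I option) and handles matcher values written as a single code, a comma-separated list, or a range.

```shell
# Sketch: compare the status code from the health check path with a Matcher
# value. Function names and example values are hypothetical.

# Succeeds when status code $1 satisfies matcher $2, where the matcher is a
# single code ("200"), a comma-separated list ("200,302"), or a range ("200-299").
code_matches() {
  status="$1"
  old_ifs="$IFS"
  IFS=','
  set -- $2            # split the matcher on commas
  IFS="$old_ifs"
  for part in "$@"; do
    case "$part" in
      *-*)
        if [ "$status" -ge "${part%-*}" ] && [ "$status" -le "${part#*-}" ]; then
          return 0
        fi
        ;;
      *)
        if [ "$status" -eq "$part" ]; then
          return 0
        fi
        ;;
    esac
  done
  return 1
}

# Request URL $1 and report whether the status code satisfies matcher $2.
# Returns 0 for a match, 1 for a mismatch, and 2 when there's no HTTP response.
probe_health() {
  url="$1"
  matcher="${2:-200}"
  # curl prints 000 as the status code when it gets no HTTP response.
  result=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$url") || true
  if [ -z "$result" ] || [ "$result" = "000" ]; then
    echo "no HTTP response from $url"
    return 2
  fi
  if code_matches "$result" "$matcher"; then
    echo "healthy: HTTP $result matches $matcher"
  else
    echo "unhealthy: HTTP $result does not match $matcher"
    return 1
  fi
}

# Example, with IPADDR set as in step 4:
#   probe_health "http://${IPADDR}:8080/health" 200
```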

Check the status of your container instance

If you get the following event message from your Amazon ECS service event, then check the status of your container instance:

"(service AWS-Service) (instance i-1234567890abcdefg) (port 443) is unhealthy in (target-group arn:aws:elasticloadbalancing:us-east-1:111111111111:targetgroup/aws-targetgroup/123456789) due to (reason Health checks failed)"

Check the status of your container instance in the Amazon EC2 console. If your instance fails the system status checks, then try stopping and starting your instance.
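
If you prefer the AWS CLI, then the following sketch shows one way to check the status checks and to stop and start the instance. The function names are illustrative, and the instance ID in the example is a hypothetical placeholder.

```shell
# Sketch: check the status checks of a container instance, then stop and
# start it.

# Print the instance state, system status, and instance status for instance $1.
instance_status() {
  aws ec2 describe-instance-status --instance-ids "$1" --include-all-instances \
    --query 'InstanceStatuses[0].[InstanceState.Name,SystemStatus.Status,InstanceStatus.Status]' \
    --output text
}

# Stop instance $1, wait until it's stopped, and then start it again.
restart_instance() {
  aws ec2 stop-instances --instance-ids "$1" &&
    aws ec2 wait instance-stopped --instance-ids "$1" &&
    aws ec2 start-instances --instance-ids "$1"
}

# Example (placeholder ID):
#   instance_status i-0aaaaaaaaaaaaaaaa
#   restart_instance i-0aaaaaaaaaaaaaaaa
```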

Related information

Create a target group

Use load balancing to distribute Amazon ECS service traffic

AWS OFFICIAL
Updated 5 months ago