Hello.
There is a custom health check for the target group. Sometimes the nodes become unhealthy, but they still appear in the ASG. How do I ensure traffic is not sent to these unhealthy nodes? Should it work by default?
With multiple EC2 targets, the NLB will not send traffic to instances that fail health checks, as long as at least one registered target is healthy.
https://docs.aws.amazon.com/elasticloadbalancing/latest/network/load-balancer-troubleshooting.html
According to the troubleshooting guide, requests are routed to unhealthy targets only when all registered targets are unhealthy. If there is at least one healthy registered target, your Network Load Balancer routes requests only to its healthy registered targets.
When there are only unhealthy registered targets, the Network Load Balancer routes requests to all the registered targets, which is known as fail-open mode. The Network Load Balancer does this instead of removing all the IP addresses from DNS when all the targets are unhealthy and the respective Availability Zones do not have a healthy target to send requests to.
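The routing rule quoted above can be sketched as a small pure function. This is only an illustrative model, not AWS code: the `Target` class and `routable_targets` function are hypothetical names, and the sketch ignores per-Availability-Zone behavior and DNS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    """Hypothetical stand-in for a registered target and its health state."""
    instance_id: str
    healthy: bool


def routable_targets(targets):
    """Return the targets that would receive traffic under the quoted rule."""
    healthy = [t for t in targets if t.healthy]
    # At least one healthy target: route only to the healthy ones.
    # No healthy targets at all: fail open and route to every registered target.
    return healthy if healthy else list(targets)


if __name__ == "__main__":
    mixed = [Target("i-aaa", True), Target("i-bbb", False)]
    all_unhealthy = [Target("i-aaa", False), Target("i-bbb", False)]
    print([t.instance_id for t in routable_targets(mixed)])
    print([t.instance_id for t in routable_targets(all_unhealthy)])
```

If unhealthy nodes appear to receive traffic, it may be worth checking whether every target was failing at that moment, since that is the case in which fail-open applies.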
Also, by adding the ELB health check type to the Auto Scaling group, EC2 instances that the load balancer reports as unhealthy can be terminated and replaced automatically.
https://docs.aws.amazon.com/autoscaling/ec2/userguide/health-checks-overview.html#elastic-load-balancing-health-checks
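A minimal sketch of turning on the ELB health check type with boto3, assuming the Auto Scaling group already has the target group attached. The group name `my-asg` and the 300-second grace period are placeholder assumptions, and the helper names are hypothetical; only `update_auto_scaling_group` and its parameters come from the boto3 API.

```python
def elb_health_check_params(asg_name, grace_period_seconds=300):
    """Build keyword arguments for update_auto_scaling_group."""
    return {
        "AutoScalingGroupName": asg_name,
        # "ELB" makes the ASG also honor the load balancer's health checks.
        "HealthCheckType": "ELB",
        # Time after launch before health check failures count against the instance.
        "HealthCheckGracePeriod": grace_period_seconds,
    }


def enable_elb_health_check(autoscaling_client, asg_name, grace_period_seconds=300):
    """Apply the settings using a boto3 'autoscaling' client passed in by the caller."""
    params = elb_health_check_params(asg_name, grace_period_seconds)
    return autoscaling_client.update_auto_scaling_group(**params)


# Example usage (requires boto3 and AWS credentials):
#   import boto3
#   enable_elb_health_check(boto3.client("autoscaling"), "my-asg")
```

The grace period should probably be at least as long as the application needs to start, otherwise new instances could be replaced before they ever pass a check.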
Also, when a new node comes up, is there a way to avoid sending traffic to it for 30 seconds or so?
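By adding a launch lifecycle hook to the Auto Scaling group, you can hold a new EC2 instance in a wait state for 30 seconds after it starts, before it goes into service and is registered with the target group.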
https://docs.aws.amazon.com/autoscaling/ec2/userguide/lifecycle-hooks.html
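A minimal sketch of such a launch lifecycle hook with boto3. The hook name `delay-before-in-service` and the group name `my-asg` are placeholder assumptions and the helper names are hypothetical. With nothing completing the hook explicitly, the instance waits for `HeartbeatTimeout` seconds and then proceeds because `DefaultResult` is `CONTINUE`.

```python
def launch_delay_hook_params(asg_name, delay_seconds=30,
                             hook_name="delay-before-in-service"):
    """Build keyword arguments for put_lifecycle_hook."""
    if not 30 <= delay_seconds <= 7200:
        # The API accepts heartbeat timeouts from 30 to 7200 seconds.
        raise ValueError("delay_seconds must be between 30 and 7200")
    return {
        "AutoScalingGroupName": asg_name,
        "LifecycleHookName": hook_name,
        # Fire the hook while the instance is launching, before it is in service.
        "LifecycleTransition": "autoscaling:EC2_INSTANCE_LAUNCHING",
        "HeartbeatTimeout": delay_seconds,
        # When the timeout elapses, continue the launch rather than abandon it.
        "DefaultResult": "CONTINUE",
    }


def add_launch_delay_hook(autoscaling_client, asg_name, delay_seconds=30):
    """Create the hook using a boto3 'autoscaling' client passed in by the caller."""
    return autoscaling_client.put_lifecycle_hook(
        **launch_delay_hook_params(asg_name, delay_seconds)
    )


# Example usage (requires boto3 and AWS credentials):
#   import boto3
#   add_launch_delay_hook(boto3.client("autoscaling"), "my-asg", 30)
```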
In your setup with Route53, Network Load Balancer (NLB), Target Group, and Auto Scaling Group (ASG) containing EC2 instances, the routing of traffic to healthy nodes should work by default. However, there are a few things to consider and configure to ensure optimal behavior:
- By default, the NLB routes traffic only to healthy targets in the target group. If you have a custom health check configured for your target group, the NLB uses it to determine which instances are healthy and should receive traffic.
- To ensure that traffic is not sent to unhealthy nodes, make sure your health check is properly configured and accurately reflects the health of your instances. The NLB automatically stops routing traffic to instances that fail the health check.
- For new nodes coming up in your ASG, note which target group settings do and do not delay traffic:
  a. Slow start mode (a warm-up period during which the load balancer sends a linearly increasing share of traffic to a newly registered target) is an Application Load Balancer feature; NLB target groups do not support it. With an NLB, a newly registered target receives no traffic until it passes the configured number of consecutive health checks, so the health check interval and healthy threshold determine the delay, provided at least one other target is healthy.
  b. The deregistration delay setting applies only to targets that are being deregistered: it keeps them in the "draining" state so in-flight connections can complete. It does not delay traffic to new instances. To hold a new instance back for a fixed period (e.g., 30 seconds), use an ASG launch lifecycle hook instead.
- If you're still experiencing issues with traffic being sent to unhealthy nodes, double-check your ASG and NLB configurations. Ensure that the ASG is properly integrated with the target group and that the ASG uses the ELB health check type, so that it acts on the target group's health check results.
- You can also consider adjusting the target group attribute "target_group_health.unhealthy_state_routing.minimum_healthy_targets.count". This allows you to specify a minimum number of healthy targets. If the number of healthy targets falls below this threshold, the NLB sends traffic to all targets, including unhealthy ones, to prevent a complete service outage.
By carefully configuring these settings, you can ensure that traffic is only routed to healthy nodes and that new instances have time to warm up before receiving a full traffic load.
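Two of the settings discussed above can be sketched in code. The first helper gives a rough estimate of how long a new target stays out of rotation, assuming it must pass `healthy_threshold` consecutive checks spaced `interval_seconds` apart; registration overhead is ignored, so treat the result as approximate. The second builds the attribute list for the minimum healthy targets setting. The helper names are hypothetical; `modify_target_group_attributes` is the boto3 `elbv2` call.

```python
MIN_HEALTHY_KEY = (
    "target_group_health.unhealthy_state_routing.minimum_healthy_targets.count"
)


def seconds_until_first_healthy(interval_seconds, healthy_threshold):
    """Rough time before a new target can pass enough consecutive health checks."""
    return interval_seconds * healthy_threshold


def min_healthy_targets_attribute(count):
    """Build the Attributes list for modify_target_group_attributes."""
    if count < 1:
        raise ValueError("count must be at least 1")
    # Attribute values are passed to the API as strings.
    return [{"Key": MIN_HEALTHY_KEY, "Value": str(count)}]


if __name__ == "__main__":
    # A 10-second interval with a healthy threshold of 3 gives about 30 seconds.
    print(seconds_until_first_healthy(10, 3))
    print(min_healthy_targets_attribute(1))

# Example usage (requires boto3 and AWS credentials; the ARN is a placeholder):
#   import boto3
#   boto3.client("elbv2").modify_target_group_attributes(
#       TargetGroupArn="<your-target-group-arn>",
#       Attributes=min_healthy_targets_attribute(1),
#   )
```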
Sources
Target groups for your Network Load Balancers - Elastic Load Balancing
Asg Setup for Ec2 instance downtime error!! | AWS re:Post
ASG EC2 HealthCheck | AWS re:Post