Hello.
Could you check the CloudWatch metric "RejectedConnectionCount" for the ALB?
This metric is recorded when a sudden increase in traffic produces connections that the ALB cannot accept because it has reached its connection limit.
If the metric has datapoints, the number of concurrent connections likely grew beyond what the load balancer could handle at that moment.
https://docs.aws.amazon.com/elasticloadbalancing/latest/application/load-balancer-cloudwatch-metrics.html
The number of connections that were rejected because the load balancer had reached its maximum number of connections.
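As a rough sketch of how this check could be scripted with boto3, the helpers below build a CloudWatch `GetMetricStatistics` query for `RejectedConnectionCount` and total the result. The function names, the three-hour window, and the 60-second period are illustrative assumptions, and the `LoadBalancer` dimension value must be replaced with your own (the `app/<name>/<id>` portion of the load balancer ARN).

```python
from datetime import datetime, timedelta, timezone


def rejected_connection_query(lb_dimension, hours=3, period=60):
    """Build the arguments for a CloudWatch GetMetricStatistics call."""
    end = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/ApplicationELB",
        "MetricName": "RejectedConnectionCount",
        # Value looks like "app/my-load-balancer/50dc6c495c0c9188"
        "Dimensions": [{"Name": "LoadBalancer", "Value": lb_dimension}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": period,
        "Statistics": ["Sum"],
    }


def total_rejected(datapoints):
    """Sum the per-period values; an empty list yields 0."""
    return sum(dp["Sum"] for dp in datapoints)


def fetch_total_rejected(lb_dimension, hours=3):
    """Query CloudWatch (needs AWS credentials and the boto3 package)."""
    import boto3  # imported lazily so the helpers above work without boto3

    cloudwatch = boto3.client("cloudwatch")
    response = cloudwatch.get_metric_statistics(
        **rejected_connection_query(lb_dimension, hours)
    )
    return total_rejected(response["Datapoints"])
```

An empty `Datapoints` list means nothing was reported for the window, which this sketch treats as zero rejected connections.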
An ALB scales automatically with traffic, but if a large burst of requests arrives before scaling completes, it may not scale in time and errors can occur.
https://docs.aws.amazon.com/elasticloadbalancing/latest/application/application-load-balancers.html
To ensure that your load balancer can scale properly, verify that each Availability Zone subnet for your load balancer has a CIDR block with at least a /27 bitmask (for example, 10.0.0.0/27) and at least eight free IP addresses per subnet. These eight IP addresses are required to allow the load balancer to scale out if needed. Your load balancer uses these IP addresses to establish connections with the targets. Without them your Application Load Balancer could experience difficulties with node replacement attempts, causing it to enter a failed state.
Note: If an Application Load Balancers subnet runs out of usable IP addresses while attempting to scale, the Application Load Balancer will run with insufficient capacity. During this time old nodes will continue to serve traffic, but the stalled scaling attempt may cause 5xx errors or timeouts when attempting to establish a connection.
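The subnet requirement quoted above can be checked mechanically. The following is a minimal sketch, assuming "at least a /27 bitmask" means a prefix length of 27 or numerically smaller (a block of that size or larger); the function names are hypothetical, and the optional lookup relies on the `CidrBlock` and `AvailableIpAddressCount` fields returned by EC2 `DescribeSubnets`.

```python
import ipaddress

MAX_PREFIX_LEN = 27  # /27 or a larger block (numerically smaller prefix)
MIN_FREE_IPS = 8     # free addresses the load balancer needs to scale out


def subnet_can_scale(cidr, available_ips):
    """Return True if the subnet meets both documented requirements."""
    network = ipaddress.ip_network(cidr)
    return network.prefixlen <= MAX_PREFIX_LEN and available_ips >= MIN_FREE_IPS


def check_subnets(subnet_ids):
    """Look up real subnets (needs AWS credentials and the boto3 package)."""
    import boto3  # imported lazily so subnet_can_scale works without boto3

    ec2 = boto3.client("ec2")
    response = ec2.describe_subnets(SubnetIds=subnet_ids)
    return {
        s["SubnetId"]: subnet_can_scale(s["CidrBlock"], s["AvailableIpAddressCount"])
        for s in response["Subnets"]
    }
```

Any subnet that maps to `False` is a candidate explanation for stalled scaling and is worth inspecting first.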
Hi, I checked the RejectedConnectionCount metric. It had no datapoints recorded, so I am assuming the value is 0.
On further investigation, I found that the test I was running with Apache Bench against the container returned non-200 responses (status code 426, although the same request worked with curl). I then used another tool (k6) to test directly against the container. The 503 errors and high latency appeared there as well, so it does not look like a load balancer issue.
To find the root cause, I created another service in isolation with the same task definition and no ALB. The high latency and 503 errors occurred whenever I enabled Service Connect.
The high latency was resolved when I either set the log delivery mode to non-blocking or disabled the log option that is presented when adding Service Connect.
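For anyone applying the same workaround through the API instead of the console, here is a hedged sketch of a Service Connect configuration with non-blocking log delivery. It assumes the `awslogs` log driver (the post does not say which driver was used); the helper name, the stream prefix, and the `25m` buffer size are illustrative choices, not values from the original test.

```python
def service_connect_config(namespace, log_group, region, non_blocking=True):
    """Build a serviceConnectConfiguration dict for ECS create/update service."""
    options = {
        "awslogs-group": log_group,
        "awslogs-region": region,
        "awslogs-stream-prefix": "service-connect",  # assumed prefix
    }
    if non_blocking:
        # Buffer log lines in memory instead of blocking the proxy on writes.
        options["mode"] = "non-blocking"
        options["max-buffer-size"] = "25m"  # assumed size; tune as needed
    return {
        "enabled": True,
        "namespace": namespace,
        "logConfiguration": {"logDriver": "awslogs", "options": options},
    }
```

The resulting dict could then be passed as `serviceConnectConfiguration` to boto3's `ecs.update_service`. Note that in non-blocking mode log lines can be dropped when the buffer fills, which is the trade-off for lower latency.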
Why some requests return 503 when Service Connect is enabled is still a mystery. Is this normal when using Service Connect? I also tried 4 vCPU and 8 GB of RAM with no improvement; around 1% of the requests sent by the load testing tool return 503.