ALB target response time increases and becomes unstable after moving the target group from EC2 instances to EKS pods


Hi, I am trying to migrate a Rails API server that runs directly on EC2 instances to EKS pods.

The server runs unicorn_rails with 20+ workers, fronted by nginx, which receives requests and passes them to unicorn_rails over a Unix domain socket.

When it ran directly on EC2 instances, with an ALB target group pointing at those instances, latency was always below 500 ms and fluctuated by only ±10 ms.

But when it runs as pods on EKS, average latency increases by nearly a hundred ms and the fluctuation range becomes ±100 ms, even though each pod receives only a few requests per second. (The EKS node pool uses the same instance type as the EC2 version.)

Is it common for the latency of requests sent from an ALB to EKS pods to be this much larger or more unstable than for requests sent to EC2 instances, even when the request flow is nearly the same?

Below are the details of the migration.

To migrate this server to EKS pods, I use the following two containers:

  1. unicorn_rails with a single worker (instead of 20+ workers; we run a larger number of pods to compensate)
  2. nginx as a sidecar container of 1, which receives requests from the ALB

Containers 1 and 2 communicate over a Unix domain socket, as in the EC2 version of the server.
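For readers who want to picture the two-container layout described above, here is a minimal sketch that builds such a Pod manifest as a Python dictionary and prints it as JSON (which `kubectl apply -f` accepts). It assumes the socket is shared through an `emptyDir` volume, which the post does not confirm. The pod name, image names, volume name, and socket directory are all hypothetical.

```python
import json

# Hypothetical values for illustration; none of them come from the original post.
SOCKET_DIR = "/var/run/unicorn"
SOCKET_VOLUME = "unicorn-socket"


def build_pod_manifest(name="rails-api"):
    """Return a Pod manifest in which nginx and unicorn share a socket directory."""
    socket_mount = {"name": SOCKET_VOLUME, "mountPath": SOCKET_DIR}
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "labels": {"app": name}},
        "spec": {
            # Assumption: an emptyDir volume holds the Unix domain socket file.
            "volumes": [{"name": SOCKET_VOLUME, "emptyDir": {}}],
            "containers": [
                {
                    # Container 1: unicorn_rails with a single worker.
                    "name": "unicorn",
                    "image": "example.invalid/rails-api:latest",
                    "volumeMounts": [socket_mount],
                },
                {
                    # Container 2: nginx sidecar that receives ALB traffic.
                    "name": "nginx",
                    "image": "nginx:stable",
                    "ports": [{"containerPort": 80}],
                    "volumeMounts": [socket_mount],
                },
            ],
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_pod_manifest(), indent=2))
```

Both containers mount the same volume at the same path, so the socket file that unicorn creates is visible to nginx, mirroring the single-host setup on EC2.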

I believe the request flow is nearly the same between the EC2 and EKS versions, so I cannot figure out why the latency behavior is so different.

old: alb =(network)=> EC2 (nginx) =(unix domain socket)=> EC2 (unicorn rails worker)
new: alb =(network)=> EKS pod (nginx) =(unix domain socket)=> EKS pod (unicorn rails worker)

Is reducing the number of unicorn_rails workers the critical factor? Or does kube-proxy add overhead?
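One way to reason about the worker-count question is a rough queueing sketch. The code below uses the textbook Erlang C formula for an M/M/c queue to compare the mean time a request waits for a free worker in two setups: one worker per pod, and 20 workers behind one socket. This is an idealized model, not a measurement of this system. It assumes Poisson arrivals, exponential service times, and that each pod queues independently. The 100 ms service time and 3 requests per second per worker are hypothetical numbers.

```python
import math


def erlang_c_wait_probability(workers, offered_load):
    """Probability that an arriving request has to queue in an M/M/c system.

    offered_load is arrival_rate * mean_service_time and must be smaller
    than the number of workers for the queue to be stable.
    """
    if offered_load >= workers:
        raise ValueError("unstable: offered load must be below worker count")
    # Sum of a^k / k! for k = 0 .. c-1 (states where a worker is still free).
    below = sum(offered_load ** k / math.factorial(k) for k in range(workers))
    # Weight of the states where all workers are busy.
    busy = offered_load ** workers / math.factorial(workers)
    busy *= workers / (workers - offered_load)
    return busy / (below + busy)


def mean_queue_wait(workers, arrival_rate, service_time):
    """Mean time (seconds) a request waits before a worker picks it up."""
    offered_load = arrival_rate * service_time
    p_wait = erlang_c_wait_probability(workers, offered_load)
    return p_wait * service_time / (workers - offered_load)


if __name__ == "__main__":
    service_time = 0.1   # hypothetical: 100 ms per request
    per_worker_rate = 3  # hypothetical: 3 requests/s per worker
    single = mean_queue_wait(1, per_worker_rate, service_time)
    pooled = mean_queue_wait(20, 20 * per_worker_rate, service_time)
    print(f"1 worker per pod:  {single * 1000:.1f} ms mean queue wait")
    print(f"20 pooled workers: {pooled * 1000:.4f} ms mean queue wait")
```

Under these assumed numbers, the single-worker case waits roughly 43 ms on average at 30% utilization, while the pooled case waits well under a millisecond at the same utilization. If the real workload behaves similarly, splitting 20 workers into 20 single-worker pods could by itself add queueing delay and variance, even at a few requests per second per pod.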

Any thoughts?


1 Answer

A few things that you can check to start with are:

  1. Resource bottlenecks - Make sure the pods have sufficient CPU, memory, and I/O resources to handle the load from the ALB. Check CloudWatch metrics for any resource constraints on the pods or nodes.
  2. Network performance - There may be higher network latency between the ALB and pods compared to EC2 instances, depending on the VPC networking configuration. Ensure pods and nodes are on the same subnets/AZs as the ALB if possible.
  3. Readiness probes - The pods may not be fully ready when first receiving traffic from the ALB. Define readiness probes on the pods to delay traffic until containers are initialized.
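As a complement to the resource check in item 1, averaged CPU metrics can look unsaturated while a container with a CPU limit is still throttled in short bursts. The sketch below reads the cgroup `cpu.stat` file from inside a container and reports the fraction of scheduler periods that were throttled. The path shown is the usual cgroup v2 location and is an assumption. Under cgroup v1 the file normally lives under the `cpu` controller directory instead, and the counters only appear when the CPU controller is enabled.

```python
from pathlib import Path


def parse_cpu_stat(text):
    """Parse the 'key value' lines of a cgroup cpu.stat file into a dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def throttled_ratio(stats):
    """Fraction of CFS enforcement periods in which the cgroup was throttled."""
    periods = stats.get("nr_periods", 0)
    if periods == 0:
        return 0.0
    return stats.get("nr_throttled", 0) / periods


if __name__ == "__main__":
    # Assumed cgroup v2 path as seen from inside the container.
    path = Path("/sys/fs/cgroup/cpu.stat")
    if path.exists():
        ratio = throttled_ratio(parse_cpu_stat(path.read_text()))
        print(f"throttled in {ratio:.1%} of periods")
    else:
        print(f"{path} not found; check the cgroup layout on this node")
```

Running this in the unicorn container before and after a load test would show whether throttling grows with traffic. A clearly non-zero ratio would point at the CPU limit rather than at the network path.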
AWS
answered 2 months ago
  • Hi, thank you for your response.

    Resource bottlenecks - According to CloudWatch metrics, CPU and network I/O do not seem to be saturated. I cannot find CloudWatch memory metrics for either the Auto Scaling group or the instances, but nginx and unicorn_rails worker memory usage is stable, and the total calculated from those values should be smaller than the total memory of each EC2 instance.

    Readiness probes - We defined a readiness probe, and it does not seem to fail during the measurement period.

    Network performance - We put services in the same AZs as much as possible (except for what is needed to recover from an AZ failure), but EKS lives in a different VPC (same AZ). Does this make a critical difference?

    Any thoughts?


  • Generally, communication between resources in different VPCs will have slightly more latency than communication between resources within the same VPC. This is because the traffic must traverse the AWS global network instead of staying fully within a single VPC.

    However, the difference in latency is typically small since AWS routes traffic efficiently within its global backbone.
