ARP resolution does not work as intended within an AWS VPC when using L2 Announcements with Cilium CNI on a Kubernetes cluster spanning EC2 instances across subnets.


VPC Configuration

  • VPC CIDR: 10.0.0.0/16
  • Availability Zone 1: 10.0.0.0/24 (public), 10.0.64.0/24 (private)
  • Availability Zone 2: 10.0.16.0/24 (public), 10.0.80.0/24 (private)
  • Availability Zone 3: 10.0.32.0/24 (public), 10.0.96.0/24 (private)
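For reference, the subnets were created along these lines (an illustrative aws CLI sketch; the VPC ID and AZ names are placeholders, not my actual values):

```bash
# One VPC per region; subnets are zonal, so each lives in one AZ.
aws ec2 create-vpc --cidr-block 10.0.0.0/16

# Example: the private subnet in AZ 1 (vpc-xxxxxxxx and the AZ name are placeholders).
aws ec2 create-subnet \
  --vpc-id vpc-xxxxxxxx \
  --cidr-block 10.0.64.0/24 \
  --availability-zone eu-west-1a
```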

EC2 Configuration

  • Instance A: Deployed in private subnet 10.0.64.0/24 in AZ 1. Acts as the control-plane node of my Kubernetes cluster.
  • Instance B: Deployed in private subnet 10.0.96.0/24 in AZ 3. Acts as a worker node in my Kubernetes cluster.
  • Instance C: Deployed in public subnet 10.0.16.0/24 in AZ 2. Acts as a worker node in my Kubernetes cluster.
  • Instance D: Deployed in public subnet 10.0.0.0/24 in AZ 1. Acts as the test machine and is not part of the cluster.

Kubernetes Setup

I've set up a Kubernetes cluster with Instance A as the control-plane node, Instance B as the private worker node, and Instance C as the public worker node. I'm using Cilium CNI with VXLAN routing and have enabled Cilium's L2 Announcements feature. I've deployed an nginx Deployment together with a Service called nginx-svc of type LoadBalancer, and created a CiliumLoadBalancerIPPool resource that assigns services of type LoadBalancer an external IP from the subnet 10.0.128.0/24.
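For context, this is roughly how Cilium was installed (a sketch using the Cilium 1.14+ Helm chart; flag names differ on older versions):

```bash
helm repo add cilium https://helm.cilium.io/

# routingMode/tunnelProtocol are the 1.14+ names for what was previously tunnel=vxlan.
# L2 Announcements requires kube-proxy replacement per the Cilium 1.14 docs.
helm install cilium cilium/cilium \
  --namespace kube-system \
  --set routingMode=tunnel \
  --set tunnelProtocol=vxlan \
  --set kubeProxyReplacement=true \
  --set l2announcements.enabled=true
```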

I chose 10.0.128.0/24 because it was unused and wouldn't conflict with my existing VPC subnets.
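The IP pool, the L2 announcement policy, and the service looked roughly like this (a sketch; field names follow the cilium.io/v2alpha1 CRDs, and the resource names are illustrative):

```bash
# spec.blocks is the Cilium >= 1.15 field; 1.14 used spec.cidrs instead.
# An empty policy selector matches all services, which is fine for this test.
kubectl apply -f - <<EOF
apiVersion: cilium.io/v2alpha1
kind: CiliumLoadBalancerIPPool
metadata:
  name: lb-pool
spec:
  blocks:
    - cidr: 10.0.128.0/24
---
apiVersion: cilium.io/v2alpha1
kind: CiliumL2AnnouncementPolicy
metadata:
  name: l2-policy
spec:
  externalIPs: true
  loadBalancerIPs: true
EOF

# Expose the nginx Deployment as a LoadBalancer service.
kubectl expose deployment nginx --name=nginx-svc --port=80 --type=LoadBalancer
```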

Problem

As expected, my nginx-svc received an external IP from the virtual subnet 10.0.128.0/24. Let's say this external IP was 10.0.128.1. When I run curl http://10.0.128.1 from Instance A, Instance B, or Instance C, I can reach nginx-svc. However, when I run curl http://10.0.128.1 on Instance D, which isn't joined to my Kubernetes cluster, the request times out. This is the problem. I've read into how the L2 Announcements feature works: a node elected as leader for the service replies to ARP requests for the virtual IP 10.0.128.1 with its own MAC address, so that other hosts on the same L2 segment (the 10.0.0.0/16 VPC in my case, i.e. Instances A, B, C, and D) resolve the virtual IP to the node running the service and send their traffic to that MAC.
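To illustrate what I mean by ARP resolution, this is how I'd verify it from Instance D (a diagnostic sketch; ens5 is a placeholder for the instance's primary network interface):

```bash
# Ask who owns the virtual IP and wait for a reply from the announcing node.
arping -I ens5 -c 3 10.0.128.1

# Check whether a neighbor (ARP) entry was ever learned for the virtual IP.
ip neigh | grep 10.0.128.1

# In a second shell, capture ARP traffic to see requests/replies on the wire.
sudo tcpdump -eni ens5 arp
```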

This exact same setup works locally when I run it on QEMU: there I can access 10.0.128.1 even from virtual machines that are not joined to the cluster. On an AWS VPC, however, the same setup fails, and I'm not entirely sure why.

The reason I want to access the service running on the virtual IP 10.0.128.1 from Instance D, which isn't joined to the cluster, is so I can create DNAT/SNAT rules on Instance D via iptables and have it forward traffic from/to its public IPv4 address to/from the private service address 10.0.128.1 reachable over the LAN/VPC, thereby simulating a public-facing service.
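For completeness, the forwarding rules I have in mind on Instance D look roughly like this (a sketch; ens5 is again a placeholder for the primary interface):

```bash
# Allow Instance D to forward packets at all.
sudo sysctl -w net.ipv4.ip_forward=1

# DNAT: traffic arriving on Instance D's port 80 is redirected to the service VIP.
sudo iptables -t nat -A PREROUTING -i ens5 -p tcp --dport 80 \
  -j DNAT --to-destination 10.0.128.1:80

# Masquerade the forwarded traffic so replies come back through Instance D.
sudo iptables -t nat -A POSTROUTING -d 10.0.128.1 -p tcp --dport 80 -j MASQUERADE
```

For this to work at all, the EC2 source/destination check would presumably also need to be disabled on Instance D, as for any NAT instance.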
