TGW appliance mode random asymmetric flow between VPCs


I'm testing an AWS solution that tries to reproduce our on-premises inbound WAN->DNAT->LAN setup with multiple service ports. Because the on-premises environment uses Active-Passive firewalls, I've created a Multi-AZ Ingress VPC with firewall appliances (same OS as on premises) to get Active-Active firewalls. This Ingress VPC is attached to the Services VPC using TGW. Both firewalls have EIP/Public/Private subnets and use DNAT to the Services VPC's IPs, with internal static routing to the Private subnet, and the Ingress VPC attachment is in Appliance Mode. I can establish connections to the Services VPC through either AZ's firewall, but sometimes the SYN/ACK is sent to the wrong firewall, causing an asymmetric flow. I can connect 5-10 times in a row, then the next connection is blocked, and in packet inspection I can see the incoming SYN/ACK on the opposite firewall. Any ideas? Thanks

2 Answers

You describe the scenario precisely. What I do not understand is why it works most of the time but sometimes does not. My understanding of Appliance Mode is that TGW uses a hash algorithm to return the packet to the original Ingress VPC TGW ENI, and in some cases that is not working. Anyway, thank you for the answer.

answered 3 months ago

There's a lot to unpack in this question and it's not 100% clear to me exactly what you've configured, so I'm going to have to guess a little. A diagram would be really helpful here.

What I think you're doing is having ingress traffic from the internet come into a couple of firewalls, each of which has an EIP associated with it. Sessions come into the firewall and the destination address is NATted to "something" (load balancer perhaps) in your services VPC which is connected via Transit Gateway.

Through this you're seeing asymmetric flows but only sometimes.

The problem you're seeing is because the traffic the firewalls forward to the services VPC still has a public source IP address (I think - again, I'm guessing here). So the return traffic from the services VPC will be going to a public IP address (the original client's IP address). I'm not sure how you have the routing set up, but what is probably happening is that the return traffic is "landing" in a random AZ and being routed to the firewall in that AZ, which may not be the firewall that the original session came through - hence the asymmetric flows.
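
To illustrate why a setup like this could fail on some connections and not others, here is a toy Python model. It assumes the return AZ is chosen by a hash over the flow's 5-tuple, independently of which firewall the client originally connected to. The real Transit Gateway selection algorithm is not public, so `pick_return_az`, the AZ names and all addresses are hypothetical, and the failure rate it prints says nothing about the rate you would see in practice.

```python
import hashlib
import random

AZS = ["az-a", "az-b"]  # hypothetical: one firewall per AZ


def pick_return_az(src_ip, src_port, dst_ip, dst_port, proto="tcp"):
    """Toy stand-in for flow-hash AZ selection (NOT the real TGW algorithm)."""
    key = f"{proto}|{src_ip}|{src_port}|{dst_ip}|{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    return AZS[digest[0] % len(AZS)]


def simulate(n_flows, seed=0):
    """Return the fraction of flows whose reply lands on the other firewall."""
    rng = random.Random(seed)
    asymmetric = 0
    for _ in range(n_flows):
        ingress_az = rng.choice(AZS)  # firewall (EIP) the client connected to
        client_port = rng.randint(1024, 65535)
        # Reply goes from the service to the client's public IP (no SNAT).
        return_az = pick_return_az("10.1.0.10", 443, "203.0.113.7", client_port)
        if return_az != ingress_az:
            asymmetric += 1
    return asymmetric / n_flows


if __name__ == "__main__":
    print(f"asymmetric fraction: {simulate(10_000):.2f}")
```

The point of the sketch is only that the mismatch is decided per flow: the same client can succeed several times and then fail when a new source port happens to hash to the other AZ.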

The easiest way to fix this is to do source NAT on the sessions coming in through the firewalls so that the network traffic coming from the firewalls going to the services VPC appears to come from the private IP address of the firewall.
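
As a rough sketch of why source NAT helps, the Python model below traces where the reply is addressed with and without it. All names and addresses (`FIREWALL_PRIVATE_IP`, `SERVICE_IP`, the client IP) are made up for illustration and do not come from the question.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str


# Hypothetical addresses for illustration only.
FIREWALL_PRIVATE_IP = {"fw-a": "10.0.1.10", "fw-b": "10.0.2.10"}
SERVICE_IP = "10.1.0.10"


def forward(firewall, client_ip, snat):
    """Packet the firewall sends to the service after DNAT (and optional SNAT)."""
    src = FIREWALL_PRIVATE_IP[firewall] if snat else client_ip
    return Packet(src=src, dst=SERVICE_IP)


def reply(pkt):
    """Service reply: swap source and destination."""
    return Packet(src=pkt.dst, dst=pkt.src)


def owner_of(ip):
    """Firewall that owns this private IP, or None if no firewall does."""
    for name, addr in FIREWALL_PRIVATE_IP.items():
        if addr == ip:
            return name
    return None


if __name__ == "__main__":
    for snat in (False, True):
        r = reply(forward("fw-b", "203.0.113.7", snat=snat))
        print(f"snat={snat}: reply dst={r.dst}, owned by {owner_of(r.dst)}")
```

With SNAT the reply is addressed to a specific firewall's private IP, so ordinary routing returns it to the firewall that handled the session. The trade-off is that the service no longer sees the original client IP in the packet.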

Edit: A colleague pointed out that we do document this - in the highlighted section at the top:

Traffic can drop if the centralized VPC receives the traffic from a different gateway — for example, an Internet gateway — and then sends that traffic to the transit gateway attachment after inspection.

Bigger picture though: It sounds like you're trying to build a centralised ingress solution using firewalls. I'm strongly opinionated about this and I think that it isn't a good model because those firewalls become a significant single point of failure - even though you have two, failover and scaling are a challenge. You can hear me talk more about this in detail on the AWS Podcast.

AWS EXPERT
answered 3 months ago
