There are two things to check here:
First, port numbers are 16 bits wide, which means that (all other things aside) you can open a maximum of 65,536 TCP connections from A to B if you are using the same destination port on B (say, port 80). In practice the ceiling is lower, because the ephemeral range the OS picks source ports from is only part of that space. You can open a lot more than that (kernel limits permitting) if you are using different destination ports.
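To make that arithmetic concrete, here is a rough sketch. The function name is just for illustration, and the 32768-60999 ephemeral range is an assumption (it is the usual Linux default for `net.ipv4.ip_local_port_range`; check your own instances, since it may have been tuned).

```python
# Rough upper bound on concurrent TCP connections from one source IP
# to one destination IP. Illustrative only; kernel and memory limits
# will usually bite before these numbers do.

PORT_SPACE = 2 ** 16  # a 16-bit port field has 65,536 possible values


def max_connections(source_ports: int, dest_ports: int) -> int:
    """Each (source port, destination port) pair is a distinct connection."""
    return source_ports * dest_ports


# Theoretical ceiling for a single destination port (e.g. port 80).
single_port_ceiling = max_connections(PORT_SPACE, 1)

# Assumed Linux default ephemeral range: 32768-60999 inclusive.
ephemeral_ports = 60999 - 32768 + 1

# With about 100 destination ports, the ceiling scales accordingly.
hundred_port_ceiling = max_connections(ephemeral_ports, 100)

print(single_port_ceiling, ephemeral_ports, hundred_port_ceiling)
```

So with many destination ports in play, the source port space alone is unlikely to be what caps you at around 51k.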
I'm guessing from your comment about 250k connections that you are using different source and destination port numbers - but I'm a little confused about how you are reaching a private EC2 instance from an external source.
Second: It sounds like you're using a NAT Gateway - and my guess is that your routing within the VPC between the two private instances is also (somehow) going through the NAT Gateway - but it probably shouldn't. NAT Gateways support up to 55,000 simultaneous connections to each unique destination - it's called out in our documentation - and that number is suspiciously close to the 51k limit you're hitting. So it's worth checking the route tables for that.
The test program I'm using opens connections to multiple ports (about 100 different ports) on the server. I've done tests to see the number of ephemeral ports in use and it never gets above about 510 per (src ip, src port, dest ip, dest port) combo. Plus I never get any errors in the logs about ephemeral port exhaustion, so I'm pretty sure it's not that.
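For anyone who wants to repeat that kind of check, here is a minimal sketch of one way to count established connections per destination. It assumes the text comes from `ss -tn` on Linux with the usual five-column layout; the function name and the sample addresses are made up for illustration.

```python
from collections import Counter


def count_by_destination(ss_output: str) -> Counter:
    """Count ESTAB connections per (local IP, peer IP, peer port)."""
    counts = Counter()
    for line in ss_output.splitlines():
        fields = line.split()
        # Expected columns: State Recv-Q Send-Q Local:Port Peer:Port
        if len(fields) < 5 or fields[0] != "ESTAB":
            continue  # skips the header line and non-established sockets
        local_ip, _local_port = fields[3].rsplit(":", 1)
        peer_ip, peer_port = fields[4].rsplit(":", 1)
        counts[(local_ip, peer_ip, int(peer_port))] += 1
    return counts


# Hypothetical sample in the assumed `ss -tn` layout.
SAMPLE = """State Recv-Q Send-Q Local Address:Port Peer Address:Port
ESTAB 0 0 10.0.1.5:40001 10.0.2.9:8000
ESTAB 0 0 10.0.1.5:40002 10.0.2.9:8000
ESTAB 0 0 10.0.1.5:40003 10.0.2.9:8001
"""

counts = count_by_destination(SAMPLE)
for key, n in sorted(counts.items()):
    print(key, n)
```

On a real client you would feed it the captured output of `ss -tn` instead of `SAMPLE`; each count is the number of source ports in use toward that destination.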
I thought the NAT gateway might be the issue as well, but I can't see any traffic being dropped by the NAT gateway in the flow logs.
As for the inbound connections, I'm not testing those against the private server. I test those using the public instance as the "server", so I just open the ports and use the Elastic IP to access the instance. I've also tried going the other way between the EC2 instances (using the public instance as the server and the private one as the client) and I have the same problem.
Flow Logs won't show traffic being dropped by the NAT Gateway, so that might still be happening.