Updating an NLB target group attribute does not apply to all connections?

Strange behavior.

I had "proxy protocol v2" enabled on one of my NLB target groups and decided to remove it. So I turned it off via the AWS Console and set my service not to parse it. Within about 2 minutes, about 95% of the connections switched correctly (progressively), but a small number still seem to use it. My service prints the first few bytes it receives, and they are the proxy protocol data.

The config has supposedly been off for over 2 hours now, and I still receive some rare packets with the proxy protocol active.

Does NLB only apply the new configuration to new connections and keep the old one for connections that remain active? Is there a way to force a full reset?

Edit: I killed my nginx pods, which forced them to be unregistered and new targets to be registered in the NLB target groups while keeping the same attribute configuration, and the errors stopped. I still believe there's an issue/bug with hot switching the configuration that causes it not to apply correctly to all new packets.

Dunge
asked 9 months ago, 397 views
1 Answer
Accepted Answer

Hello,

When you disable the target group attribute "proxy_protocol_v2.enabled", it can take up to about 120 seconds for the new attribute value to be applied. So you should no longer see NLB inserting proxy protocol headers into new connections about 2 minutes after updating the target group.
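To rule out a console change that did not actually take effect, the attribute can be read back through the ELBv2 API. This is a minimal sketch, assuming `elbv2_client` is a boto3 `elbv2` client (for example `boto3.client("elbv2")`) and that the target group ARN is known; the function names are hypothetical.

```python
PPV2_KEY = "proxy_protocol_v2.enabled"


def proxy_protocol_v2_enabled(elbv2_client, target_group_arn):
    """Return True if the target group currently reports PPv2 as enabled."""
    response = elbv2_client.describe_target_group_attributes(
        TargetGroupArn=target_group_arn
    )
    for attribute in response["Attributes"]:
        if attribute["Key"] == PPV2_KEY:
            # Attribute values come back as the strings "true" / "false".
            return attribute["Value"].lower() == "true"
    # Attribute not listed at all: treat it as disabled.
    return False


def disable_proxy_protocol_v2(elbv2_client, target_group_arn):
    """Set the attribute to "false"; the change may take ~2 minutes to apply."""
    elbv2_client.modify_target_group_attributes(
        TargetGroupArn=target_group_arn,
        Attributes=[{"Key": PPV2_KEY, "Value": "false"}],
    )
```

A `False` result only confirms the stored configuration. It does not show what existing connections are still carrying.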

Also, proxy protocol headers are only exchanged at the start of a new connection, so any proxy protocol information that was exchanged remains valid throughout the lifetime of the connection. Long-lived connections will therefore still have proxy protocol attributes associated with them, while all new connections formed after disabling proxy protocol will no longer have proxy protocol attributes associated with them.
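While old and new connections coexist, a target could tolerate both by checking whether the first bytes it receives begin with the 12-byte proxy protocol v2 signature. This is a rough sketch based on the public PROXY protocol v2 layout, not on how any particular service parses it; `split_proxy_header` is a hypothetical helper and only the IPv4 address block is decoded.

```python
import socket
import struct

# Fixed 12-byte signature that opens every PROXY protocol v2 header.
PP2_SIGNATURE = b"\r\n\r\n\x00\r\nQUIT\n"


def split_proxy_header(data):
    """Return (info, payload); info is None when no PPv2 header is present."""
    if len(data) < 16 or not data.startswith(PP2_SIGNATURE):
        return None, data
    ver_cmd, fam_proto, length = struct.unpack("!BBH", data[12:16])
    if ver_cmd >> 4 != 2:  # high nibble must be protocol version 2
        return None, data
    end = 16 + length
    if len(data) < end:
        raise ValueError("truncated proxy protocol v2 header")
    info = {
        "command": ver_cmd & 0x0F,      # 0 = LOCAL, 1 = PROXY
        "family": fam_proto >> 4,       # 1 = IPv4, 2 = IPv6, 3 = UNIX
        "transport": fam_proto & 0x0F,  # 1 = STREAM (TCP), 2 = DGRAM (UDP)
    }
    if info["family"] == 1 and length >= 12:
        src, dst, sport, dport = struct.unpack("!4s4sHH", data[16:28])
        info["source"] = (socket.inet_ntoa(src), sport)
        info["destination"] = (socket.inet_ntoa(dst), dport)
    # Any bytes after the address block (TLVs) are skipped, not parsed.
    return info, data[end:]
```

Logging `info` next to the payload shows which connections still carry a header, which helps tell long-lived flows apart from genuinely new ones.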

With TCP listeners, the load balancer prepends a proxy protocol header to the TCP data. It does not discard or overwrite any existing data, including any incoming proxy protocol headers sent by the client or any other proxies, load balancers, or servers in the network path [1], so we should make sure the host communicating with the NLB is not inserting its own PPv2 headers.

You should also check that the target instances registered to the NLB target group where you disabled 'proxy protocol v2' are not also registered to another NLB's target group that may still have PPv2 enabled.
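The cross-check described above can be scripted. This is a sketch only, assuming `elbv2_client` is a boto3 `elbv2` client with permission to describe target groups, target health and attributes, and that `target_id` is the instance ID or IP address exactly as registered; both function names are hypothetical.

```python
PPV2_KEY = "proxy_protocol_v2.enabled"


def iter_target_groups(elbv2_client):
    """Yield every target group visible to the client, following pagination."""
    marker = None
    while True:
        kwargs = {"Marker": marker} if marker else {}
        page = elbv2_client.describe_target_groups(**kwargs)
        yield from page["TargetGroups"]
        marker = page.get("NextMarker")
        if not marker:
            return


def ppv2_groups_for_target(elbv2_client, target_id):
    """Return ARNs of target groups that contain target_id and have PPv2 on."""
    matches = []
    for group in iter_target_groups(elbv2_client):
        arn = group["TargetGroupArn"]
        health = elbv2_client.describe_target_health(TargetGroupArn=arn)
        registered = any(
            description["Target"]["Id"] == target_id
            for description in health["TargetHealthDescriptions"]
        )
        if not registered:
            continue
        attrs = elbv2_client.describe_target_group_attributes(TargetGroupArn=arn)
        enabled = any(
            a["Key"] == PPV2_KEY and a["Value"].lower() == "true"
            for a in attrs["Attributes"]
        )
        if enabled:
            matches.append(arn)
    return matches
```

An empty list means no visible target group that contains the target still has PPv2 enabled. Target groups in other accounts or regions are not covered.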

To look into the issue and confirm whether the NLB is still inserting PPv2 information into new connections even after PPv2 was disabled, we would need to review the NLB and a packet capture from the target instance.

Keeping in mind your data privacy, you can open a technical support case with AWS using the link [2] and share a packet capture from the targets attached to the NLB that shows PPv2 information being inserted into new connections even after the "proxy protocol v2" attribute was disabled. After checking the corresponding resources, AWS Premium Support engineers will be able to provide insights and assist you accordingly.

[1] https://docs.aws.amazon.com/elasticloadbalancing/latest/network/load-balancer-target-groups.html#proxy-protocol

[2] AWS Support Center - https://console.aws.amazon.com/support

AWS
answered 8 months ago
  • Thank you. "proxy protocol headers are only exchanged at the start of a new connection so any proxy protocol information that was exchanged remains valid throughout the lifetime of the connection." is probably the symptom I was seeing. Although I was seeing that using UDP, and UDP is technically "connectionless", the flow probably remained active through some keep-alive because the devices communicated every minute. Only by forcefully cutting the link (killing the receiving nginx pods) was I able to stop it.

    I can confirm there was no other NLB or target group and that the host was not inserting its own headers. Unfortunately I do not have the infrastructure in place to easily do a packet capture on my target instance (which is a Kubernetes pod).

    In any case, for my personal situation I won't have to switch this setting regularly, so this is not an issue anymore. I just wanted to report the behavior since it was not something I was expecting. It created a service outage only for some specific incoming connections, and caused headaches trying to diagnose the cause.
