Unusually high network traffic of aws-node pods

At a high level, we monitor network traffic in our EKS clusters through the container_network_transmit_bytes_total and container_network_receive_bytes_total metrics provided by cAdvisor. Recently, while investigating network usage to reduce our costs for NatGateway-Bytes and DataTransfer-Regional-Bytes, I stumbled upon unusually high network usage by the aws-node and kube-proxy pods. Looking at the utilization, the following queries

sum (increase(container_network_transmit_bytes_total{pod=~"aws-node.*",cluster="prod"}[24h])) / 1024^3
sum (increase(container_network_transmit_bytes_total{pod=~"kube-proxy.*",cluster="prod"}[24h])) / 1024^3

both return the same value (to GB precision): 17818 GB/day. Compared to the traffic of the rest of the cluster

sum (increase(container_network_transmit_bytes_total{pod!~"kube-proxy.*|aws-node.*",cluster="prod"}[24h])) / 1024^3

which returns 9817 GB/day, this seems unusually high. I could not find anything online that would justify these numbers. As I understand it, kube-proxy only creates iptables rules on the nodes to forward packets to the correct pods/services, but these findings suggest that the packets are actually routed through kube-proxy. Is there any way to debug this further, or can somebody explain the high network usage of aws-node and kube-proxy? Also, why do these two pods report almost identical network usage?
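
One possible way to debug this further is to break the same totals down per node and per network interface, which would show whether aws-node and kube-proxy report identical counters on every node. The sketch below only builds the PromQL string and the URL for Prometheus' instant-query endpoint (/api/v1/query); it does not send any request. The `instance` and `interface` labels, the helper names, and the example Prometheus address are assumptions and may differ in your setup. One hypothesis worth checking, not confirmed in this thread, is that both DaemonSets run with hostNetwork: true, in which case cAdvisor may be attributing the node's own interface counters to each of them.

```python
from urllib.parse import urlencode

# Metric name taken from the queries above.
METRIC = "container_network_transmit_bytes_total"


def breakdown_query(pod_regex: str, cluster: str, window: str = "24h") -> str:
    """Build a PromQL query for GiB sent per (instance, pod, interface).

    The `instance` and `interface` label names are assumptions about how
    the cAdvisor metrics are labelled in this cluster.
    """
    selector = f'{METRIC}{{pod=~"{pod_regex}",cluster="{cluster}"}}'
    return (
        f"sum by (instance, pod, interface) "
        f"(increase({selector}[{window}])) / 1024^3"
    )


def instant_query_url(base_url: str, promql: str) -> str:
    """Return the URL for an instant query against the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


if __name__ == "__main__":
    # Hypothetical Prometheus address; replace with your own.
    base = "http://prometheus.example:9090"
    for regex in ("aws-node.*", "kube-proxy.*"):
        query = breakdown_query(regex, "prod")
        print(query)
        print(instant_query_url(base, query))
```

If the per-node, per-interface values for the two pods match each other, that would suggest both are reporting the same underlying counters rather than traffic they each generated.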

Best, Sam

Sam
Asked 1 year ago, viewed 114 times
No answers
