Unusually high network traffic of aws-node pods

At a high level, we monitor network traffic in our EKS clusters through the container_network_transmit_bytes_total and container_network_receive_bytes_total metrics provided by cadvisor. Recently, while investigating network usage and trying to reduce the costs for NatGateway-Bytes and DataTransfer-Regional-Bytes, I stumbled upon unusually high network usage by the aws-node and kube-proxy pods. The following queries

sum (increase(container_network_transmit_bytes_total{pod=~"aws-node.*",cluster="prod"}[24h])) / 1024^3
sum (increase(container_network_transmit_bytes_total{pod=~"kube-proxy.*",cluster="prod"}[24h])) / 1024^3

both return the same value, 17818 GiB/day, when rounded to whole GiB (the queries divide by 1024^3). Compared to the remaining traffic of the cluster

sum (increase(container_network_transmit_bytes_total{pod!~"kube-proxy.*|aws-node.*",cluster="prod"}[24h])) / 1024^3

which returns 9817 GiB/day, this seems unusually high. I could not find anything online that would explain these numbers. To my understanding, kube-proxy only creates iptables rules on the nodes to forward packets to the correct pods and services, but these findings suggest that the packets are actually routed through kube-proxy. Is there any way to debug this further, or can somebody explain this high network usage of aws-node and kube-proxy? Also, why do these two pods report almost identical network usage?
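
A possible next debugging step, as a rough sketch: split the same metric by pod and by network interface instead of summing everything. This assumes the scrape configuration keeps cadvisor's interface label on the series, which may not be the case in every setup.

```promql
# Transmit volume over 24h in GiB, per pod and per interface
# (assumes the "interface" label from cadvisor is not dropped by relabeling)
sum by (pod, interface) (
  increase(container_network_transmit_bytes_total{pod=~"aws-node.*|kube-proxy.*",cluster="prod"}[24h])
) / 1024^3
```

If an aws-node pod and a kube-proxy pod on the same node show identical values for each interface, that would suggest both series are reading the same underlying counters rather than measuring separate traffic.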

Best, Sam

Sam
Asked 1 year ago · 115 views
No answers
