Scaling deployments to 0 based on HTTP requests


Hello,

I have a test cluster. Basically, I want to scale deployments down to 0 if there have been no requests to the load balancer / the Service IP of the Deployment for 30 minutes. Can someone confirm how to create metrics for the number of requests that hit the Service IP over the past hour? Once we have those metrics, let me know whether we should use HPA or KEDA to scale the deployments down.

3 Answers

To scale Kubernetes to zero pods effectively, several steps and components are involved:

  1. Collecting Metrics and App Traffic Measurement: Begin by gathering metrics and monitoring the traffic flow of the application. One approach is to introduce a proxy in front of the application. This proxy can temporarily buffer incoming requests when the app is scaled down to zero and then forward those requests when the app becomes active.

  2. Additional Proxy Container: In Kubernetes, this proxy can be implemented as an extra proxy container within the Pod.

  3. Metrics Handling: Once you have collected the necessary metrics, you need two essential components:

    • Metrics Server: Kubernetes does not include a built-in metrics server by default, so you need to set up a separate metrics server to store and aggregate the collected metrics.
    • Metrics Export: Implement a mechanism to export the collected metrics to the metrics server for further analysis and utilization.
  4. Autoscaler Configuration: The final step is configuring the Horizontal Pod Autoscaler (HPA) to leverage the metrics from the metrics server and adjust the replica count of your application accordingly.

  5. Using KEDA for Simplification: To streamline these processes and components, the Kubernetes ecosystem offers KEDA, an event-driven autoscaler that implements the Custom Metrics API. KEDA comprises three key components:

    • Scaler: Scalable adapters that can collect metrics from various sources such as databases, message brokers, and telemetry systems. KEDA has a specialized scaler that creates an HTTP proxy for measuring and buffering incoming requests.
    • Metrics Adapter: Responsible for exposing the metrics collected by the HTTP scaler in a format compatible with the Kubernetes metrics pipeline.
    • Controller: The KEDA controller brings all the components together and performs the following tasks:
      • Collects metrics using the adapter and exposes them to the metrics API.
      • Registers and manages KEDA-specific Custom Resource Definitions (CRDs).
      • Creates and manages the Horizontal Pod Autoscaler (HPA).
  6. Using HTTPScaledObject with KEDA: With KEDA, you don't directly create an HPA. Instead, you create an HTTPScaledObject, which is a wrapper around the HPA. It includes:

    • Instructions on how to connect to the source of the metrics.
    • Guidelines for routing traffic to the application.
    • Settings for creating the Horizontal Pod Autoscaler.

By adopting KEDA, you simplify the process of scaling Kubernetes to zero pods, as it provides a comprehensive solution that integrates metrics collection, metrics server, proxy handling, and HPA configuration into one cohesive framework.
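As an illustrative sketch (not a drop-in manifest), an HTTPScaledObject for a hypothetical `my-app` Deployment might look roughly like this; names such as `my-app`, `my-app-svc`, and `myapp.example.com` are assumptions, and field names like `scaledownPeriod` vary between add-on versions, so check the docs for the release you install:

```yaml
apiVersion: http.keda.sh/v1alpha1
kind: HTTPScaledObject
metadata:
  name: my-app              # hypothetical app name
spec:
  hosts:
    - myapp.example.com     # host the interceptor routes on (assumed)
  scaleTargetRef:
    name: my-app            # target Deployment (assumed)
    service: my-app-svc     # Service fronting the Deployment (assumed)
    port: 8080
  replicas:
    min: 0                  # allow scale to zero
    max: 10
  scaledownPeriod: 1800     # 30 minutes with no traffic before scaling to zero
```

The add-on's interceptor proxy buffers any request that arrives while the app is at zero replicas and releases it once a pod is ready, which is the buffering behavior described in step 1.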

profile pictureAWS
answered 8 months ago
  • Thank you for your answer. Can you provide some guidelines on using KEDA with the Prometheus Operator for collecting metrics, instead of the KEDA HTTP add-on? Basically, I am new to Prometheus and am struggling to collect metrics for the service endpoint.

  • The KEDA docs recommend the HTTP add-on since it adds first-class, end-to-end support for HTTP. But you can build your own custom HTTP autoscaling system using the Prometheus scaler.
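A rough sketch of that Prometheus-scaler approach is below; the Deployment name, Prometheus address, metric name, and query are all assumptions you would adapt to your own setup:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: my-app-scaler
spec:
  scaleTargetRef:
    name: my-app                  # your Deployment (assumed name)
  minReplicaCount: 0              # permit scale to zero
  maxReplicaCount: 5
  cooldownPeriod: 1800            # 30 min of inactivity before scaling to zero
  triggers:
    - type: prometheus
      metadata:
        # address of the Prometheus instance managed by the operator (assumed)
        serverAddress: http://prometheus-operated.monitoring.svc:9090
        # assumed application metric; your app must expose something comparable
        query: sum(rate(http_requests_total{service="my-app"}[30m]))
        threshold: "1"
```

Note that with this route, unlike the HTTP add-on, nothing buffers requests while the app is at zero replicas, so the first request after scale-down may fail or time out.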


Hi,

I don't know of metrics you could obtain from Kubernetes directly; e.g., https://github.com/kubernetes/kube-state-metrics/blob/main/docs/service-metrics.md has nothing valuable for your case.

What about implementing metrics on the application side and, based on them, scaling it to zero using KEDA? Of course, you will not be able to scale down all the components, because you need one that exposes these metrics. If you are using ingress-nginx, you might be able to get the metrics you need from it.
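For example, if ingress-nginx's Prometheus metrics endpoint is enabled, a KEDA Prometheus trigger could count requests per ingress. The trigger fragment below is only a sketch; the Prometheus address and the `ingress` label value are assumptions:

```yaml
triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus.monitoring.svc:9090   # assumed address
      # nginx_ingress_controller_requests is exposed by ingress-nginx when its
      # metrics endpoint is scraped; the ingress label value is an assumption
      query: sum(rate(nginx_ingress_controller_requests{ingress="my-app"}[30m]))
      threshold: "1"
```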

EXPERT
answered 8 months ago
  • Thank you for your response. Can you provide the documentation for the metrics ingress-nginx exposes?


Hi,

You can get the deployment pods to 0 when you have zero incoming requests to the load balancer using KEDA (Kubernetes Event-Driven Autoscaling). KEDA performs autoscaling based on the number of events, a custom scraped metric breaching a specific threshold, or messages waiting in a Kafka queue. Refer to this tutorial to implement KEDA in an EKS (Elastic Kubernetes Service) cluster: https://aws.amazon.com/blogs/mt/proactive-autoscaling-of-kubernetes-workloads-with-keda-using-metrics-ingested-into-amazon-cloudwatch/

You can also scale EKS deployment pods by querying metrics stored in Amazon Managed Service for Prometheus. Please refer to this blog: https://aws.amazon.com/blogs/mt/proactive-autoscaling-kubernetes-workloads-keda-metrics-ingested-into-aws-amp/

KEDA is needed here because the HPA (Horizontal Pod Autoscaler) in Kubernetes only scales pods down to 1, even when there are zero incoming requests.
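For the CloudWatch route from the blogs above, the key setting is `minReplicaCount: 0` on the ScaledObject, which is what lets KEDA go below the HPA's floor of 1. The sketch below uses KEDA's aws-cloudwatch scaler; the Deployment name, metric namespace, dimensions, and region are placeholders:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: my-app-cloudwatch
spec:
  scaleTargetRef:
    name: my-app                         # placeholder Deployment name
  minReplicaCount: 0                     # unlike a plain HPA, KEDA can reach zero
  triggers:
    - type: aws-cloudwatch
      metadata:
        namespace: AWS/ApplicationELB    # placeholder metric namespace
        metricName: RequestCount         # placeholder metric
        dimensionName: LoadBalancer
        dimensionValue: app/my-alb/1234  # placeholder load balancer dimension
        metricStat: Sum
        targetMetricValue: "1"
        minMetricValue: "0"
        awsRegion: us-east-1             # placeholder region
```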

Comment here for further questions.

answered 8 months ago
