Is my EKS monitoring architecture with AMP, ADOT, Fluent Bit, Loki, and OSS Grafana correct? How do I estimate its cost?

EKS monitoring stack architecture diagram

I'm totally new to monitoring, but after reading a bunch of articles and resources on observability in Kubernetes, I tried to put together this EKS monitoring stack that combines ADOT, Fluent Bit, Amazon Managed Service for Prometheus (AMP), Grafana OSS, and Loki (Grafana Cloud). We're currently running an EKS cluster and expect it to scale over time, so to avoid potentially high costs from CloudWatch Container Insights and log ingestion, we're exploring this more open-source-centric approach that selectively uses AWS managed services. I'd really appreciate feedback: does this architecture look correct and feasible for production use? Also, how do I go about estimating the costs involved with AMP, Loki, S3 (for cold storage), and running Grafana OSS?

1 Answer
Your proposed EKS monitoring architecture using AMP, ADOT, Fluent Bit, Loki, and OSS Grafana is a valid approach for monitoring an EKS cluster. This setup combines AWS managed services with open-source tools, which can provide a flexible and cost-effective solution for monitoring and observability.

The architecture you've outlined includes:

  1. AWS Distro for OpenTelemetry (ADOT) for metrics collection
  2. Amazon Managed Service for Prometheus (AMP) for metrics storage and querying
  3. Fluent Bit for log collection
  4. Loki (Grafana Cloud) for log aggregation and querying
  5. Open Source Grafana for visualization
  6. S3 for cold storage of logs

This combination allows you to leverage the scalability and managed aspects of AMP while using open-source tools for other components, potentially reducing costs compared to a full CloudWatch-based solution.

To estimate costs for this architecture:

  1. Amazon Managed Service for Prometheus (AMP):
  • Costs are based on metric samples ingested, metric storage (per GB-month), and query samples processed.
  • You'll need to estimate your metrics volume (active series and scrape interval) and retention period.
  2. S3 for cold storage:
  • Costs depend on the amount of data stored and the storage class used.
  • Consider lifecycle policies to move data to cheaper storage tiers over time.
  3. Loki (Grafana Cloud):
  • Pricing is typically based on log ingestion rate and retention period.
  • Check Grafana Cloud's pricing page for current rates.
  4. Open Source Grafana:
  • Costs here will primarily be the compute resources to run Grafana in your EKS cluster.
  5. EKS cluster resources:
  • Factor in the additional CPU and memory required to run ADOT and Fluent Bit on your nodes.
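
As a rough illustration of where the volume numbers come from, the sketch below converts an active series count and scrape interval into samples per month, and a per-node log rate into GB per month. The function names and the example inputs (100,000 series, 30-second scrapes, 2 GB of logs per node per day, 10 nodes) are assumptions chosen for illustration, not measurements from your cluster.

```python
SECONDS_PER_DAY = 86_400


def monthly_samples(active_series: int, scrape_interval_s: float, days: int = 30) -> float:
    """Samples ingested over `days`: one sample per series per scrape."""
    scrapes_per_day = SECONDS_PER_DAY / scrape_interval_s
    return active_series * scrapes_per_day * days


def monthly_log_gb(gb_per_node_per_day: float, nodes: int, days: int = 30) -> float:
    """Log volume shipped by Fluent Bit over `days`, assuming a flat per-node rate."""
    return gb_per_node_per_day * nodes * days


if __name__ == "__main__":
    # Illustrative inputs only; replace with numbers observed in your cluster.
    print(f"metric samples / month: {monthly_samples(100_000, 30):,.0f}")
    print(f"log GB / month:         {monthly_log_gb(2.0, 10):,.1f}")
```

With these example inputs, 100,000 series scraped every 30 seconds works out to about 8.64 billion samples over 30 days, which shows why dropping unused series or lengthening the scrape interval has a direct effect on the AMP ingestion line item.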

To get an accurate cost estimate, you'll need to:

  1. Estimate your metrics and log volumes
  2. Determine your retention requirements
  3. Estimate query volumes for both metrics and logs
  4. Calculate the additional cluster resources needed for running the monitoring components
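
Once those four inputs are known, they can be combined into a simple monthly model. The following is a minimal sketch, assuming flat per-unit rates: the `Volumes` and `Rates` classes and the `estimate` function are hypothetical names, and the numbers in `PLACEHOLDER_RATES` are made up and are not real AWS or Grafana Cloud prices. Real pricing may be tiered and varies by region, so look up current rates on the vendors' pricing pages before relying on the result.

```python
from dataclasses import dataclass


@dataclass
class Volumes:
    metric_samples: float       # samples ingested into AMP per month
    query_samples: float        # samples processed by AMP queries per month
    metrics_storage_gb: float   # GB of metrics retained in AMP
    log_gb_ingested: float      # GB of logs sent to Loki per month
    s3_gb_stored: float         # GB of logs kept in S3 cold storage
    extra_node_hours: float     # node-hours consumed by ADOT, Fluent Bit, Grafana


@dataclass
class Rates:
    per_million_metric_samples: float
    per_billion_query_samples: float
    per_gb_metrics_storage: float
    per_gb_logs: float
    per_gb_s3: float
    per_node_hour: float


def estimate(v: Volumes, r: Rates) -> dict[str, float]:
    """Return a per-component monthly cost breakdown plus a total."""
    items = {
        "amp_ingestion": v.metric_samples / 1e6 * r.per_million_metric_samples,
        "amp_queries": v.query_samples / 1e9 * r.per_billion_query_samples,
        "amp_storage": v.metrics_storage_gb * r.per_gb_metrics_storage,
        "loki_logs": v.log_gb_ingested * r.per_gb_logs,
        "s3_cold_storage": v.s3_gb_stored * r.per_gb_s3,
        "cluster_compute": v.extra_node_hours * r.per_node_hour,
    }
    items["total"] = sum(items.values())
    return items


if __name__ == "__main__":
    # PLACEHOLDER values, NOT real prices: substitute current published rates.
    PLACEHOLDER_RATES = Rates(
        per_million_metric_samples=1.0,
        per_billion_query_samples=1.0,
        per_gb_metrics_storage=1.0,
        per_gb_logs=1.0,
        per_gb_s3=1.0,
        per_node_hour=1.0,
    )
    example = Volumes(8.64e9, 1e10, 50, 600, 2000, 720)
    for name, usd in estimate(example, PLACEHOLDER_RATES).items():
        print(f"{name:>16}: {usd:,.2f}")
```

Keeping each component as its own line item makes it easy to see which part dominates as the cluster grows, and to rerun the model when retention or scrape settings change.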

Remember that while this architecture can be cost-effective, it does require more management overhead compared to fully managed solutions like CloudWatch. Ensure you have the necessary expertise to maintain and troubleshoot this stack in a production environment.

As your cluster scales, regularly review and optimize your monitoring setup to ensure it remains cost-effective and meets your observability needs.
Sources
Prometheus monitoring on Amazon EKS - AWS Prescriptive Guidance
Monitoring tools for Amazon EKS - AWS Prescriptive Guidance
Monitoring - Implementing Microservices on AWS
How does KRO integrate with monitoring and logging systems? | AWS re:Post

answered 6 months ago