Gaps in CloudWatch Detailed Monitoring on EC2 instance stores


I have a customer who recently moved to M5d instance types to make use of the on-board instance stores. With the instance stores, the customer saw much better performance overall, but they began to see some throttling/performance problems at peak load.

With regular (CloudWatch) monitoring, they could see the "flat line" throttling in DiskReadOps and DiskWriteOps over some period of time during heavy work. We discussed it, and they were OK with the throttling, so that's not the problem. The problem is that when they turned on detailed monitoring (1-minute vs. 5-minute datapoints), they started to see gaps in their graphs.

How do I even begin to explain it when I don't understand why they would have those gaps?

I can provide a sample graph the customer sent, if that will help.

asked 6 years ago · 885 views
1 Answer
Accepted Answer

There are several reasons why a datapoint can go missing in a CloudWatch metric, such as:

  1. A transient issue with your instance or the network

  2. The nature of the metric - we do not record a datapoint if the metric value is 0, so there will be a gap in the graph wherever the metric value = 0

  3. Backfilling of delayed metrics

If the issue is backfilling of metrics (which seems to be the case here, based on the symptoms), the customer should see the missing datapoints populate after some time. If they do not, the customer can open a case with Premium Support, and CloudWatch Support Ops can take a closer look.
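To check whether the gaps fill in later, one option is to list the 1-minute timestamps that have no datapoint and then run the same check again a little later. The following is only a minimal sketch, not an official tool: the helper names `find_missing_minutes` and `fetch_datapoint_timestamps` are made up for illustration, the instance ID in the comment is a placeholder, and it assumes boto3 is installed with credentials that allow `cloudwatch:GetMetricStatistics`. It also cannot tell you which of the reasons above caused a gap, only where the gaps are.

```python
from datetime import datetime, timedelta, timezone


def find_missing_minutes(timestamps, start, end, period_seconds=60):
    """Return the period-aligned timestamps in [start, end) with no datapoint.

    Assumes `start` is aligned to a period boundary (for example, seconds and
    microseconds set to zero for a 60-second period) and that all datetimes
    are timezone-aware.
    """
    step = timedelta(seconds=period_seconds)
    present = set(timestamps)
    missing = []
    current = start
    while current < end:
        if current not in present:
            missing.append(current)
        current += step
    return missing


def fetch_datapoint_timestamps(instance_id, metric_name, start, end, period_seconds=60):
    """Fetch datapoint timestamps for one EC2 metric.

    Requires boto3 and AWS credentials. A single call returns at most
    1,440 datapoints, so keep the window to a day or less at 1-minute periods.
    """
    import boto3  # imported lazily so find_missing_minutes works without boto3

    cloudwatch = boto3.client("cloudwatch")
    response = cloudwatch.get_metric_statistics(
        Namespace="AWS/EC2",
        MetricName=metric_name,
        Dimensions=[{"Name": "InstanceId", "Value": instance_id}],
        StartTime=start,
        EndTime=end,
        Period=period_seconds,
        Statistics=["Sum"],
    )
    return [point["Timestamp"] for point in response["Datapoints"]]


# Example usage (placeholder instance ID, last hour of DiskReadOps):
#
#   end = datetime.now(timezone.utc).replace(second=0, microsecond=0)
#   start = end - timedelta(hours=1)
#   seen = fetch_datapoint_timestamps("i-0123456789abcdef0", "DiskReadOps", start, end)
#   for ts in find_missing_minutes(seen, start, end):
#       print("no datapoint at", ts.isoformat())
```

If a second run over the same fixed window reports fewer missing timestamps than the first, that is consistent with delayed metrics being backfilled. If the same timestamps stay missing, that is the point to open a support case.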

answered 6 years ago
