Write IOPS incorrectly reported ~3X higher after migration to db.m5.xlarge

0

We upgraded from db.m4.xlarge to db.m5.xlarge last night and saw an immediate 3X increase in Write IOPS according to the AWS/RDS WriteIOPS metric (the standard, not the enhanced monitoring). We also upgraded our MySQL DB Engine version to 5.6.41 from 5.6.37-39 or so.

I've found other posts saying that the WriteIOPS metric reads several times too high for db.m4 instances, but our issue is that the m5 instance reports values several times higher than the m4 instance did.

I have two concerns:

  1. Are we being billed on this miscalculated higher amount?
  2. How can we see the real WriteIOPS without paying for enhanced monitoring, which we currently have no need for?

Is there a fix?

Related posts:
https://forums.aws.amazon.com/thread.jspa?messageID=877494
https://forums.aws.amazon.com/thread.jspa?messageID=850686&tstart=0
https://forums.aws.amazon.com/thread.jspa?messageID=848354&tstart=0

Edited by: wrp on May 2, 2019 12:30 PM

wrp
asked 5 years ago
343 views
16 Answers
0

Perhaps Amazon staff can provide a suggestion?

wrp
answered 5 years ago
0

We are aware of the discrepancy in WIOPS. You are not being billed since the excess IOPS are not actually being expressed on the underlying volume.
In your case you actually have 4 striped gp2 volumes comprising your RDS allocated volume. This means you have 12k burst IOPS and are not coming anywhere close to exhausting your burst balance.
thanks,
-Phil
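
The 12k figure above follows from the published gp2 limits: each volume can burst to 3,000 IOPS, so 4 striped volumes give 4 x 3,000 = 12,000. A minimal sketch of that arithmetic, assuming the standard gp2 constants (3 IOPS per GiB baseline, 100 IOPS minimum, 16,000 IOPS maximum, 5.4 million I/O credits per volume). The 100 GiB volume size in the usage note is hypothetical; the thread does not state the allocated storage.

```python
# Sketch of gp2 burst arithmetic. Constants are the published gp2 limits;
# the volume sizes passed in are hypothetical (the thread does not give them).
GP2_BASELINE_PER_GIB = 3        # baseline IOPS earned per GiB
GP2_MIN_BASELINE = 100          # floor on baseline IOPS
GP2_MAX_IOPS = 16_000           # ceiling on baseline IOPS
GP2_BURST_IOPS = 3_000          # burst ceiling for small volumes
GP2_CREDIT_BUCKET = 5_400_000   # I/O credits in a full bucket


def gp2_baseline_iops(size_gib):
    """Baseline IOPS of a single gp2 volume of the given size."""
    return min(max(GP2_MIN_BASELINE, GP2_BASELINE_PER_GIB * size_gib), GP2_MAX_IOPS)


def striped_burst_iops(volume_count, size_gib_each):
    """Peak IOPS across a stripe set: each volume bursts independently."""
    per_volume = max(GP2_BURST_IOPS, gp2_baseline_iops(size_gib_each))
    return volume_count * per_volume


def seconds_until_empty(size_gib, sustained_iops):
    """Seconds a full credit bucket lasts on one volume, or None if it never drains."""
    baseline = gp2_baseline_iops(size_gib)
    # A volume cannot be driven past its burst ceiling.
    demand = min(sustained_iops, max(GP2_BURST_IOPS, baseline))
    if demand <= baseline:
        return None  # at or below baseline, credits are not consumed
    return GP2_CREDIT_BUCKET / (demand - baseline)
```

For example, `striped_burst_iops(4, 100)` gives 12000, matching the figure quoted above, and `seconds_until_empty(100, 3000)` gives 2000.0 seconds of full burst on a single hypothetical 100 GiB volume.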

AWS
MODERATOR
philaws
answered 5 years ago
0

Thanks for the information, Phil. To follow up:

  1. We are seeing that our burst balance has dropped since this upgrade date and is not recovering. Is the Burst Balance metric also incorrect, or are we running the risk of having performance outages as a result of this bug? This is very scary for us.

  2. Do you have an ETA for when this bug will be fixed? It seems it has been open since at least February 8, 2018 (15 months).

  3. In the meantime, how can we keep track of how close we are to our WIOPS capacity without upgrading our monitoring?

Edited by: wrp on May 7, 2019 10:43 AM

wrp
answered 5 years ago
0

I am sorry I am not able to provide a date.

When did you see the BurstBalance drop? I can see that your BurstBalance has been near 100% for the past week.
I can also share that you have 4 striped volumes, each of which has 100% BurstBalance.
The best way for you to monitor this is through the CloudWatch metric BurstBalance.

-Phil
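
One way to follow the BurstBalance suggestion without enhanced monitoring is to pull the metric with boto3's `get_metric_statistics` call on the CloudWatch client. This is only a sketch: the instance identifier, the 24 hour window, and the helper names are assumptions for illustration, and the call needs AWS credentials with CloudWatch read access.

```python
from datetime import datetime, timedelta, timezone


def burst_balance_query(db_instance_id, hours=24, period_seconds=300, now=None):
    """Build get_metric_statistics arguments for the RDS BurstBalance metric."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/RDS",
        "MetricName": "BurstBalance",
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": db_instance_id}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": period_seconds,
        "Statistics": ["Minimum"],  # the lowest point matters, not the average
    }


def lowest_burst_balance(datapoints):
    """Return (timestamp, percent) of the lowest datapoint, or None if empty."""
    if not datapoints:
        return None
    worst = min(datapoints, key=lambda point: point["Minimum"])
    return worst["Timestamp"], worst["Minimum"]


def fetch_lowest_burst_balance(db_instance_id, hours=24):
    """Query CloudWatch; requires boto3 and AWS credentials."""
    import boto3  # imported here so the helpers above work without boto3

    cloudwatch = boto3.client("cloudwatch")
    response = cloudwatch.get_metric_statistics(
        **burst_balance_query(db_instance_id, hours)
    )
    return lowest_burst_balance(response["Datapoints"])
```

Calling `fetch_lowest_burst_balance("my-db-instance")` (a placeholder identifier) would return the time and value of the lowest BurstBalance reading in the last day, which is the number to compare against reports such as "dropped to 99".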

AWS
MODERATOR
philaws
answered 5 years ago
0

The burst balance was at 100 until 2019-05-01 18:00 Pacific. At that time, it dropped to 99 and has stayed there, with the exception of 2019-05-07 8:28 Pacific at which point it dipped slightly lower, possibly due to a slight increase in WIOPS. The burst balance has not returned to 100 since the upgrade, whereas before it was at 100 with the exception of a few windows where we were nearing our IOPS limits.

wrp
answered 5 years ago
0

I'm experiencing a similar issue after upgrading from t2.small to t3.small. Just wanted to report that this issue seems to affect t2/t3 servers as well.

Would be happy to provide instance id if it helps with diagnosing the issue. From the other reports, I understand that there should be no impact to performance/billing.

answered 5 years ago
0

Is it possible to get any clarity on this, Phil? I'm concerned primarily by the dip in our burst balance... particularly when it lowers at certain points when there is higher IOPS. We should have plenty of capacity and the burst should not be lowering if it is truly based on the actual IOPS, not the mis-reported IOPS.

This is a significant bug that affects our understanding of services consumed and billing. This should be prioritized higher than it currently is, seeing it's been an issue for over a year.

Edited by: wrp on May 14, 2019 10:29 AM

wrp
answered 5 years ago
0

This is happening to me too. I have the same IOPS problem: over the last 15 days my IOPS have increased 10X. That is a very large jump, and I do not know how to stop or manage it.

You are invited to see the following thread: https://forums.aws.amazon.com/thread.jspa?threadID=275498&tstart=0

I am going crazy trying to find a solution or help anywhere on the web.

Regards
Juan de Dios

answered 5 years ago
0

It's too bad that AWS isn't able to help out with anything useful here. We're still seeing dips in our CloudWatch metrics which indicate that our burst balance is being used due to this bug. This is going to be a serious issue when we start getting billed as though our usage is 3X higher.

AWS Engineers/Support: Please help! Having an open bug like this since Feb 8, 2018 is unacceptable.

wrp
answered 5 years ago
0

This is an issue with the way metrics are measured. The excess does not count against your burst balance. And as we reviewed earlier, your BurstBalance is never less than 99%, which is within rounding error of a full BurstBalance. I understand this issue prevents a clear view of your metrics. Until it can be rectified, I still recommend you use BurstBalance as a proxy to measure IOPS depletion. Again, with your current workload for the past several months, I don't see any incidents where BurstBalance dropped below 99%.

Sorry for the inconvenience.

-Phil
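
If BurstBalance is used as the proxy, a CloudWatch alarm can watch it instead of checking the console by hand. The sketch below builds arguments for boto3's `put_metric_alarm`. The 80% threshold, the three 5 minute evaluation periods, the alarm name, and the optional SNS topic ARN are all hypothetical choices for illustration, not values recommended in this thread.

```python
def burst_balance_alarm(db_instance_id, threshold_percent=80.0, sns_topic_arn=None):
    """Build put_metric_alarm arguments for a low-BurstBalance alarm."""
    params = {
        "AlarmName": f"{db_instance_id}-burst-balance-low",  # hypothetical naming
        "Namespace": "AWS/RDS",
        "MetricName": "BurstBalance",
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": db_instance_id}],
        "Statistic": "Minimum",
        "Period": 300,             # seconds per evaluation window
        "EvaluationPeriods": 3,    # alarm after 15 minutes below the threshold
        "Threshold": threshold_percent,
        "ComparisonOperator": "LessThanThreshold",
    }
    if sns_topic_arn:
        # Notify an SNS topic when the alarm fires.
        params["AlarmActions"] = [sns_topic_arn]
    return params


def create_burst_balance_alarm(db_instance_id, **kwargs):
    """Create the alarm; requires boto3 and AWS credentials."""
    import boto3  # imported here so the builder above works without boto3

    boto3.client("cloudwatch").put_metric_alarm(
        **burst_balance_alarm(db_instance_id, **kwargs)
    )
```

A threshold well below 99% avoids alarming on the small dips described in this thread while still giving warning before credits run out.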

AWS
MODERATOR
philaws
answered 5 years ago
0

Unfortunately, this has already affected us. Last month's bill shows a 200X increase in IOPS consumption on AWS RDS, so our billing has risen drastically. I have already opened a support case, but so far nobody in the support center has responded to it.

This is driving me crazy. After all, customers are affected by the IOPS billing. I invite you to follow this thread:

https://forums.aws.amazon.com/thread.jspa?threadID=304095&tstart=0

where I will post everything that the technical support people tell me.
Regards
Juan de Dios

answered 5 years ago
0

Juan de Dios,
It looks like you are using Aurora MySQL. You can PM me with details of your issue, but I am sure that the issue reported here is not related to Aurora MySQL clusters. IOPS are measured differently for the two storage options.

-Phil

AWS
MODERATOR
philaws
answered 5 years ago
0

UPDATE

The problem continues to get worse. We are now seeing our burst balance dip into the 90% range during regular usage, which should be well below our allotted IOPS. Still no help from AWS. I hate to be a squeaky wheel here, but we really need some help!

Support case 6196608661 has been opened.

Edited by: wrp on Jun 25, 2019 10:24 AM

wrp
answered 5 years ago
0

I still have not heard back from Amazon on the case I opened ~48 hrs ago, despite the advertised < 24 hour response time. I opened up a subsequent case yesterday of higher priority, which should have received a response in < 12 hr but have not heard back on that either. This is for paid support.

I'm not quite sure what my options are at this point.

wrp
answered 5 years ago
0

Judging by the response times you list, I imagine you have the support plan that starts at 100 USD. At that level of support you should have a support phone number; it is shown in your support panel, in the contact methods section.

If your paid plan is the Developer support level, then all you can do is wait; you have no other options.

Check your console for the technical support contact details!

Regards

answered 5 years ago
0

We have the developer support plan, with listed response times:

General guidance:
< 24 hours

System impaired:
< 12 hours

source: https://console.aws.amazon.com/support/plans/home#/

Still checking the console religiously :)

wrp
answered 5 years ago
