How to configure CloudWatch alerts for CPU utilization, free storage space, etc. across multiple (100+) RDS instances?

I have more than 100 RDS instances. I want to configure alerts such as high CPU utilization and low free storage space for all of them at once, and have those alerts delivered to PagerDuty. Each alert in PagerDuty should show the name of the instance that has high CPU utilization, low storage space, or whatever else triggered it. How can I do this?

2 Answers

We have done exactly this using Terraform, both for instances that were created via Terraform and for those that were not. We also use Terraform to configure our entire PagerDuty environment.

The best you can do is script the alarm creation: loop through each RDS instance and create each alarm, naming it after the instance it monitors.

The alarm action should send to an SNS topic, to which PagerDuty is subscribed via HTTPS.

I would write a bash script or use IaC to loop through each instance and create the alarms.

Written once, in perhaps 5-10 lines of code, this creates all your alarms.

The SNS payload is sent to the PagerDuty service, where the details of which instance triggered the alarm are visible in the incident.
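
The loop described above could be sketched with boto3 roughly as follows. This is only an illustration under assumptions, not the Terraform setup mentioned in this answer: the SNS topic ARN is a hypothetical placeholder, and the thresholds, evaluation periods, and the rds-<instance>-<suffix> naming scheme are made up for the example. The functions take the RDS and CloudWatch clients as arguments so they can be exercised without touching AWS.

```python
# Sketch: one high-CPU and one low-storage alarm per RDS instance.
# The topic ARN, thresholds, and alarm naming scheme are assumptions.

# Hypothetical ARN of the SNS topic that PagerDuty is subscribed to.
SNS_TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:pagerduty-rds-alerts"

# (name suffix, metric, threshold, comparison operator); values are illustrative.
ALARM_SPECS = [
    ("high-cpu", "CPUUtilization", 80.0, "GreaterThanThreshold"),  # percent
    ("low-storage", "FreeStorageSpace", 10.0 * 1024**3, "LessThanThreshold"),  # bytes
]


def build_alarms(instance_id, topic_arn=SNS_TOPIC_ARN):
    """Return put_metric_alarm keyword arguments for one instance."""
    alarms = []
    for suffix, metric, threshold, comparison in ALARM_SPECS:
        alarms.append({
            # Putting the instance name in AlarmName keeps it in the SNS payload.
            "AlarmName": f"rds-{instance_id}-{suffix}",
            "AlarmDescription": f"{metric} alarm for RDS instance {instance_id}",
            "Namespace": "AWS/RDS",
            "MetricName": metric,
            "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": instance_id}],
            "Statistic": "Average",
            "Period": 300,           # seconds
            "EvaluationPeriods": 3,  # three consecutive 5-minute periods
            "Threshold": threshold,
            "ComparisonOperator": comparison,
            "AlarmActions": [topic_arn],
        })
    return alarms


def list_instance_ids(rds):
    """Collect every DB instance identifier, following pagination."""
    ids = []
    for page in rds.get_paginator("describe_db_instances").paginate():
        ids.extend(db["DBInstanceIdentifier"] for db in page["DBInstances"])
    return ids


def create_all_alarms(rds, cloudwatch):
    """Create (or overwrite) the alarms for every instance; return the count."""
    created = 0
    for instance_id in list_instance_ids(rds):
        for alarm in build_alarms(instance_id):
            cloudwatch.put_metric_alarm(**alarm)
            created += 1
    return created
```

To run it for real, one would import boto3 and call create_all_alarms(boto3.client("rds"), boto3.client("cloudwatch")). Because put_metric_alarm overwrites an existing alarm with the same name, re-running the script should be safe.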

Expert, answered 8 months ago
Expert, reviewed 8 months ago

Is it a one-off or are you creating new RDS instances regularly?

As a one-off, you could simply loop over the existing metrics in the RDS namespace (AWS/RDS) and create alarms on them. You could do that from the CLI or from a Lambda function, using the ListMetrics API to list the metrics in the RDS namespace and PutMetricAlarm to create the alarms.
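
A minimal sketch of that one-off approach with boto3 might look like the following. The SNS topic ARN, the 80% threshold, and the alarm naming scheme are assumptions made for illustration; only CPUUtilization is shown, and other metrics would follow the same pattern. RDS also publishes aggregate metrics under other dimensions, so the sketch keeps only metrics whose single dimension is DBInstanceIdentifier.

```python
# Sketch: discover instances through ListMetrics, then call PutMetricAlarm.
# The topic ARN, threshold, and naming scheme are assumptions.

# Hypothetical ARN of the SNS topic that forwards to PagerDuty.
SNS_TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:pagerduty-rds-alerts"


def instance_ids_with_metric(cloudwatch, metric_name):
    """Return instance IDs that report metric_name in the AWS/RDS namespace."""
    ids = set()
    pages = cloudwatch.get_paginator("list_metrics").paginate(
        Namespace="AWS/RDS",
        MetricName=metric_name,
        Dimensions=[{"Name": "DBInstanceIdentifier"}],
    )
    for page in pages:
        for metric in page["Metrics"]:
            dims = metric["Dimensions"]
            # Keep only per-instance metrics (exactly one dimension).
            if len(dims) == 1 and dims[0]["Name"] == "DBInstanceIdentifier":
                ids.add(dims[0]["Value"])
    return sorted(ids)


def create_cpu_alarms(cloudwatch, threshold=80.0, topic_arn=SNS_TOPIC_ARN):
    """Create one high-CPU alarm per discovered instance; return alarm names."""
    names = []
    for instance_id in instance_ids_with_metric(cloudwatch, "CPUUtilization"):
        name = f"rds-{instance_id}-high-cpu"
        cloudwatch.put_metric_alarm(
            AlarmName=name,  # instance name appears in the notification
            AlarmDescription=f"High CPU on RDS instance {instance_id}",
            Namespace="AWS/RDS",
            MetricName="CPUUtilization",
            Dimensions=[{"Name": "DBInstanceIdentifier", "Value": instance_id}],
            Statistic="Average",
            Period=300,
            EvaluationPeriods=3,
            Threshold=threshold,
            ComparisonOperator="GreaterThanThreshold",
            AlarmActions=[topic_arn],
        )
        names.append(name)
    return names
```

Calling create_cpu_alarms(boto3.client("cloudwatch")) after importing boto3 would run it against a real account. One caveat worth checking: ListMetrics only returns metrics that have reported data recently, so an instance that has been stopped for a long time may be missed.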

If you create RDS instances regularly:

  • if you create them from infrastructure as code, e.g. Terraform or CloudFormation, is there a reason that prevents you from creating the alarms from the same stacks?
  • if you don't control the creation, you can listen for events that indicate the creation of a new RDS instance, trigger a Lambda function on those events, and implement the alarm creation in that function
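
The event-driven option in the last bullet could be sketched as a Lambda handler like the one below. It assumes an EventBridge rule forwards RDS DB instance events to the function, that the creation event carries the ID RDS-EVENT-0005, and that the instance name arrives in detail.SourceIdentifier; verify these against the events you actually receive. The topic ARN, threshold, and naming scheme are hypothetical.

```python
# Sketch: Lambda handler that creates an alarm when a new RDS instance appears.
# The event shape, event ID, topic ARN, and threshold are assumptions.

# Hypothetical ARN of the SNS topic that forwards to PagerDuty.
SNS_TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:pagerduty-rds-alerts"

# Assumed ID of the "DB instance created" event.
CREATION_EVENT_ID = "RDS-EVENT-0005"


def new_instance_id(event):
    """Return the instance ID from a creation event, or None for other events."""
    if event.get("source") != "aws.rds":
        return None
    detail = event.get("detail", {})
    if detail.get("EventID") != CREATION_EVENT_ID:
        return None
    return detail.get("SourceIdentifier")


def handle_event(event, cloudwatch, topic_arn=SNS_TOPIC_ARN):
    """Create a high-CPU alarm for a newly created instance; return its name."""
    instance_id = new_instance_id(event)
    if instance_id is None:
        return None  # not a creation event, nothing to do
    name = f"rds-{instance_id}-high-cpu"
    cloudwatch.put_metric_alarm(
        AlarmName=name,
        AlarmDescription=f"High CPU on RDS instance {instance_id}",
        Namespace="AWS/RDS",
        MetricName="CPUUtilization",
        Dimensions=[{"Name": "DBInstanceIdentifier", "Value": instance_id}],
        Statistic="Average",
        Period=300,
        EvaluationPeriods=3,
        Threshold=80.0,
        ComparisonOperator="GreaterThanThreshold",
        AlarmActions=[topic_arn],
    )
    return name


def lambda_handler(event, context):
    """Lambda entry point; boto3 is imported lazily so the logic stays testable."""
    import boto3
    return handle_event(event, boto3.client("cloudwatch"))
```

Keeping the alarm logic in handle_event, separate from the entry point, lets it be tested with a stub client before it is deployed.
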
Jsc (AWS), answered 8 months ago
  • It's a one-off, and I'm going to keep them. Can you elaborate on each step in detail? I can't find any option to select all instances at the same time. I did create an alarm, but its notification did not mention the name of the instance whose CPU had spiked.
