How to create a composite alarm that reduces alarm noise during a deployment window for an ECS service on Fargate


Hi, I want to create a composite alarm based on one of the example expressions in the documentation.

Quote from the documentation: 'ALARM(CPUUtilizationTooHigh) AND NOT ALARM(DeploymentInProgress) The expression specifies that the composite alarm goes into ALARM if CPUUtilizationTooHigh is in ALARM and DeploymentInProgress is not in ALARM. This is an example of a composite alarm that reduces alarm noise during a deployment window.'
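To make the quoted rule concrete, here is a minimal Python sketch of its truth table. It only models whether the rule expression is true for a given pair of child alarm states; the function name is made up for illustration and nothing here calls CloudWatch.

```python
from itertools import product

# The three states a CloudWatch alarm can be in.
STATES = ("OK", "ALARM", "INSUFFICIENT_DATA")


def composite_rule_is_true(cpu_state: str, deployment_state: str) -> bool:
    """Model ALARM(CPUUtilizationTooHigh) AND NOT ALARM(DeploymentInProgress).

    NOT ALARM(x) is true for any state other than ALARM, so both OK and
    INSUFFICIENT_DATA on the deployment alarm let the CPU alarm through.
    """
    return cpu_state == "ALARM" and deployment_state != "ALARM"


if __name__ == "__main__":
    # Print every combination of child states and the rule result.
    for cpu, dep in product(STATES, STATES):
        print(f"{cpu:>17} {dep:>17} -> {composite_rule_is_true(cpu, dep)}")
```

The only rows that evaluate to true are the ones where the CPU alarm is in ALARM and the deployment alarm is not, which is why the second alarm has to be in ALARM for the whole deployment window.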

I have no problem with the first alarm, which uses the CPUUtilization metric. I'm stuck on the second part: how do I implement 'NOT ALARM(DeploymentInProgress)'? Which metrics can I use to create this alarm?

2 Answers

You would likely need your deployment system to publish a DeploymentInProgress metric with a value of 1 for each minute that a deployment is running. That would let you create an alarm called DeploymentInProgress that goes into ALARM when it sees a 1 and stays in ALARM for as long as it keeps seeing a 1.

The key ingredient is that metric with a value of 1. How you get it depends on which deployment system you use. One option is a Lambda function triggered every minute by an EventBridge scheduled rule: the function queries the deployment system and logs the metric you need (the embedded metric format, EMF, is the best way to log a metric from Lambda).

answered 2 years ago

Hi, there are many ways you could implement a DeploymentInProgress alarm. Here are a few suggestions:

  1. Use a dummy alarm that is always true or always false, and update it at the start and at the end of your deployment. See the example below, called "AlwaysFalse": by simply updating the threshold, you can force the alarm to flip to either always true or always false within the next minute.
  2. If your deployments run at a regular cadence, use metric math to create a pseudo-metric that represents the time window of your deployment.
  3. If the script or pipeline that controls your deployment exposes an API or endpoint that you can use to detect whether a deployment is in progress, you could use multi-datasource querying to scrape the API or endpoint status and return it as a metric to alarm on.

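Example of the AlwaysFalse alarm for tip 1:

aws cloudwatch put-metric-alarm --alarm-name "AlwaysFalse" --actions-enabled --evaluation-periods 1 --datapoints-to-alarm 1 --threshold 1 --comparison-operator GreaterThanThreshold --metrics '[{"Id":"e1","Label":"Expression1","ReturnData":true,"Expression":"TIME_SERIES(2)","Period":300}]'

Note that TIME_SERIES(2) always evaluates to 2, so with GreaterThanThreshold the alarm is in ALARM while the threshold is below 2 and in OK while the threshold is 2 or higher.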
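A minimal Python sketch of how tip 1 could be wired up from a deployment script, assuming boto3 is used to call CloudWatch. The two helper functions and the composite alarm name are hypothetical; they only build the keyword arguments for `put_metric_alarm` and `put_composite_alarm`, mirroring the CLI example. The thresholds 1 and 3 are illustrative values on either side of the constant 2 returned by TIME_SERIES(2).

```python
def dummy_alarm_params(deploying: bool, alarm_name: str = "AlwaysFalse") -> dict:
    """Keyword arguments for put_metric_alarm on the constant TIME_SERIES(2).

    deploying=True  -> threshold 1, 2 > 1, alarm goes to ALARM
    deploying=False -> threshold 3, 2 > 3 is false, alarm goes to OK
    """
    return {
        "AlarmName": alarm_name,
        "ActionsEnabled": True,
        "EvaluationPeriods": 1,
        "DatapointsToAlarm": 1,
        "Threshold": 1.0 if deploying else 3.0,
        "ComparisonOperator": "GreaterThanThreshold",
        "Metrics": [{
            "Id": "e1",
            "Label": "Expression1",
            "ReturnData": True,
            "Expression": "TIME_SERIES(2)",
            "Period": 300,
        }],
    }


def composite_alarm_params(cpu_alarm: str = "CPUUtilizationTooHigh",
                           deployment_alarm: str = "AlwaysFalse",
                           alarm_name: str = "CPUTooHighOutsideDeployment") -> dict:
    """Keyword arguments for put_composite_alarm using the documented rule."""
    return {
        "AlarmName": alarm_name,  # hypothetical name
        "AlarmRule": f"ALARM({cpu_alarm}) AND NOT ALARM({deployment_alarm})",
        "ActionsEnabled": True,
    }


# Usage from a deployment script (requires boto3 and AWS credentials):
#   import boto3
#   cw = boto3.client("cloudwatch")
#   cw.put_composite_alarm(**composite_alarm_params())          # one-time setup
#   cw.put_metric_alarm(**dummy_alarm_params(deploying=True))   # before deploying
#   ... run the deployment ...
#   cw.put_metric_alarm(**dummy_alarm_params(deploying=False))  # after deploying
```

Calling `put_metric_alarm` with an existing alarm name updates that alarm in place, so the script only changes the threshold and leaves the composite alarm untouched.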

AWS
answered 4 months ago
