Trying to build out alarm automation and running into a snag on storage alarms.

I have the CloudWatch agent installed on an EC2 instance for testing. Here is my config:

{
    "agent": {
        "metrics_collection_interval": 60,
        "run_as_user": "cwagent"
    },
    "metrics": {
        "append_dimensions": {
            "AutoScalingGroupName": "${aws:AutoScalingGroupName}",
            "ImageId": "${aws:ImageId}",
            "InstanceId": "${aws:InstanceId}",
            "InstanceType": "${aws:InstanceType}"
        },
        "aggregation_dimensions" : [["InstanceId","path"]],
        "metrics_collected": {
            "disk": {
                "measurement": [
                    "used_percent"
                ],
                "metrics_collection_interval": 60,
                "resources": [
                    "*"
                ],
                "ignore_file_system_types": [
                    "sysfs", "devtmpfs", "tmpfs", "overlay", "debugfs", "squashfs", "iso9660", "proc", "autofs", "tracefs"
                ],
                "drop_device": true
            },
            "mem": {
                "measurement": [
                    "mem_used_percent"
                ],
                "metrics_collection_interval": 60
            }
        }
    }
}

I'm trying to get the CloudWatch agent to send disk stats (which it does), but when I create an alarm with my Lambda function, the created alarm doesn't receive any disk data.

My function project is in TypeScript. The storage alarm function has two dependencies: one filters through tags to set alarm property values, and the other creates or updates alarms. I won't include the tag filtering function because it works as expected and does not create the alarm. I will, however, provide the storage alarm function (manageStorageAlarmForInstance) and the function that takes the parameters passed from it and actually creates the alarms.

// Function to create or update alarms:
async function createOrUpdateAlarm(
  alarmName: string,
  instanceId: string,
  props: AlarmProps
) {
  try {
    await cloudWatchClient.send(
      new PutMetricAlarmCommand({
        AlarmName: alarmName,
        ComparisonOperator: 'GreaterThanThreshold',
        EvaluationPeriods: props.evaluationPeriods,
        MetricName: props.metricName,
        Namespace: props.namespace,
        Period: props.period,
        Statistic: 'Average',
        Threshold: props.threshold,
        ActionsEnabled: false,
        Dimensions: props.dimensions,
      })
    );
    log
      .info()
      .str('alarmName', alarmName)
      .str('instanceId', instanceId)
      .num('threshold', props.threshold)
      .num('period', props.period)
      .num('evaluationPeriods', props.evaluationPeriods)
      .msg('Alarm configured');
  } catch (e) {
    log
      .error()
      .err(e)
      .str('alarmName', alarmName)
      .str('instanceId', instanceId)
      .msg('Failed to create or update alarm due to an error');
  }
}

// Function to create storage monitoring alarms:
async function manageStorageAlarmForInstance(
  instanceId: string,
  instanceType: string,
  imageId: string,
  tags: Tag,
  type: AlarmClassification
): Promise<void> {
  const baseAlarmName = `autoAlarm-EC2-${instanceId}-${type}StorageUtilization`;
  const thresholdKey = `autoalarm:storage-free-percent-${type.toLowerCase()}`;
  const durationTimeKey = 'autoalarm:storage-percent-duration-time';
  const durationPeriodsKey = 'autoalarm:storage-percent-duration-periods';
  const defaultThreshold = type === 'Critical' ? 10 : 20;

  const alarmProps: AlarmProps = {
    threshold: defaultThreshold,
    period: 60,
    namespace: 'disk',
    evaluationPeriods: 5,
    metricName: 'used_percent',
    dimensions: [
      {Name: 'InstanceId', Value: instanceId},
      {Name: 'ImageId', Value: imageId},
      {Name: 'InstanceType', Value: instanceType},
      {Name: 'Path', Value: '/'},
    ],
  };

  try {
    configureAlarmPropsFromTags(
      alarmProps,
      tags,
      thresholdKey,
      durationTimeKey,
      durationPeriodsKey
    );
  } catch (e) {
    log.error().err(e).msg('Error configuring alarm props from tags');
    throw new Error('Error configuring alarm props from tags');
  }
  // Check whether the alarm already exists.
  const alarmExists = await doesAlarmExist(baseAlarmName);
  if (
    !alarmExists ||
    // needsUpdate compares the alarm props against the current alarm values.
    (alarmExists && (await needsUpdate(baseAlarmName, alarmProps)))
  ) {
    await createOrUpdateAlarm(baseAlarmName, instanceId, alarmProps);
    log
      .info()
      .str('alarmName', baseAlarmName)
      .str('instanceId', instanceId)
      .msg('Storage usage alarm configured or updated.');
  } else {
    log
      .info()
      .str('alarmName', baseAlarmName)
      .str('instanceId', instanceId)
      .msg('Storage usage alarm is already up-to-date');
  }
}

Any ideas on what's wrong with the way I'm creating my storage alarms? Why won't those alarms receive data from the CloudWatch agent?

Thanks in advance for the assist.

asked 13 days ago, 142 views
2 Answers

Hello.

I think you should first check whether the metrics themselves are being collected, before looking at the alarm.
Are the target CloudWatch metrics being published?
If the metrics exist, there may be a problem with your Lambda code.
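
One way to make that check concrete: CloudWatch treats every distinct combination of namespace, metric name, and dimension names and values as a separate metric, so an alarm only receives data when its settings match a published metric exactly (dimension names are case-sensitive). The sketch below is a minimal, hypothetical helper that compares an alarm's settings against a list of metric descriptors, such as the ones a ListMetrics call returns. The names `MetricDescriptor` and `findExactMatch` are illustrative and not part of any SDK, and the sample data is made up.

```typescript
// Hypothetical local types; they mirror the Namespace/MetricName/Dimensions
// fields CloudWatch uses, but are defined here so the sketch is self-contained.
interface Dimension {
  Name: string;
  Value: string;
}

interface MetricDescriptor {
  Namespace: string;
  MetricName: string;
  Dimensions: Dimension[];
}

// Serialize dimensions in a stable order so two sets can be compared as strings.
function dimensionKey(dimensions: Dimension[]): string {
  return dimensions
    .map((d) => `${d.Name}=${d.Value}`)
    .sort()
    .join('|');
}

// Return the published metric whose namespace, metric name, and complete
// dimension set all equal the alarm's, or undefined if there is none.
function findExactMatch(
  alarm: MetricDescriptor,
  published: MetricDescriptor[]
): MetricDescriptor | undefined {
  const wanted = dimensionKey(alarm.Dimensions);
  return published.find(
    (m) =>
      m.Namespace === alarm.Namespace &&
      m.MetricName === alarm.MetricName &&
      dimensionKey(m.Dimensions) === wanted
  );
}

// Made-up sample: one published metric and an alarm that differs in
// namespace, metric name, and the case of the "path" dimension name.
const samplePublished: MetricDescriptor[] = [
  {
    Namespace: 'CWAgent',
    MetricName: 'disk_used_percent',
    Dimensions: [
      {Name: 'InstanceId', Value: 'i-0123456789abcdef0'},
      {Name: 'path', Value: '/'},
    ],
  },
];

const sampleAlarm: MetricDescriptor = {
  Namespace: 'disk',
  MetricName: 'used_percent',
  Dimensions: [
    {Name: 'InstanceId', Value: 'i-0123456789abcdef0'},
    {Name: 'Path', Value: '/'},
  ],
};

console.log(findExactMatch(sampleAlarm, samplePublished) !== undefined);
```

If the helper finds no match for the values your Lambda passes to PutMetricAlarm, the alarm is pointing at a metric that nothing publishes to, which would explain why it stays without data.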

EXPERT
answered 13 days ago
EXPERT
reviewed 13 days ago
  • You are correct, the metrics do exist. If I go into CloudWatch and start creating an alarm from scratch, I can see each mount point individually for the EC2 instance that has the CloudWatch agent installed, with storage stats reported per mount point. I'm just trying to figure out how to configure the function in my Lambda correctly to do the same.
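
Since the metrics show up per mount point in the console, a likely cause is that the alarm's namespace, metric name, or dimensions differ from what the agent publishes. The agent normally publishes to the CWAgent namespace unless the config sets a namespace, reports the disk measurement as disk_used_percent, and uses a lowercase path dimension. The aggregation_dimensions entry in the config above should also produce a rolled-up metric that carries only InstanceId and path. Treat all of these as assumptions to verify against what the console shows for the instance. A minimal sketch of alarm properties built on them follows; `buildStorageAlarmProps` is a hypothetical helper, `AlarmProps` is redefined locally to keep the sketch self-contained, and the threshold of 80 is an arbitrary example value.

```typescript
interface Dimension {
  Name: string;
  Value: string;
}

// Local stand-in for the AlarmProps type used in the question.
interface AlarmProps {
  threshold: number;
  period: number;
  namespace: string;
  evaluationPeriods: number;
  metricName: string;
  dimensions: Dimension[];
}

// Build props that target the rolled-up [InstanceId, path] metric.
function buildStorageAlarmProps(
  instanceId: string,
  path: string,
  threshold: number
): AlarmProps {
  return {
    threshold,
    period: 60,
    evaluationPeriods: 5,
    // Assumption: agent default namespace, since the config sets none.
    namespace: 'CWAgent',
    // Assumption: the agent prefixes disk measurements with "disk_".
    metricName: 'disk_used_percent',
    // Assumption: matches aggregation_dimensions [["InstanceId","path"]];
    // note the lowercase "path".
    dimensions: [
      {Name: 'InstanceId', Value: instanceId},
      {Name: 'path', Value: path},
    ],
  };
}

const exampleProps = buildStorageAlarmProps('i-0123456789abcdef0', '/', 80);
console.log(JSON.stringify(exampleProps.dimensions));
```

To alarm on the non-aggregated metric instead, the dimension list would likely need to include every dimension the agent attaches (for this config, probably InstanceId, ImageId, InstanceType, fstype, and path, plus AutoScalingGroupName when the instance belongs to a group), copied exactly as the console displays them.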


Have you looked at this blog post? It offers a ready-made solution; just read the instructions very carefully: https://aws.amazon.com/blogs/mt/use-tags-to-create-and-maintain-amazon-cloudwatch-alarms-for-amazon-ec2-instances-part-1/

njoylif
answered 7 days ago
