How to extend EC2 CloudWatch custom metrics and dashboards to an Auto Scaling Group so new instances automatically install the agent, push CPU/memory/disk metrics, and update dashboards?

1

I have created a Terraform script that launches a new EC2 instance, installs the CloudWatch agent, pushes CPU, memory, and disk metrics, and creates a custom dashboard for these metrics. Now I would like to extend this setup to the Auto Scaling Group (ASG) level. Specifically, whenever the load on the current instance increases, the ASG should automatically launch a new instance. On every newly launched instance, the CloudWatch agent should be installed automatically to push the custom metrics, and the custom dashboard should also be created.

asked 8 months ago, 203 views
4 Answers
0

Bake the CloudWatch Agent into the Launch Template

An ASG launches every instance from its Launch Template (or Launch Configuration), and the user data in that template runs on each new instance's first boot, so that is where you install the agent.

In Terraform, update your launch template’s user_data to include:

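#!/bin/bash
# Install the CloudWatch agent (the package is in the Amazon Linux repos)
yum install -y amazon-cloudwatch-agent
# Fetch the config from SSM Parameter Store, then start the agent (-s)
/opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl \
  -a fetch-config -m ec2 \
  -c ssm:AmazonCloudWatch-linux \
  -s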

The -c ssm:AmazonCloudWatch-linux argument pulls the config from Systems Manager (SSM) Parameter Store, so you don't hardcode JSON in user data and every instance in the ASG gets the same config.

  1. Store the CloudWatch Agent Config in SSM

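Put your amazon-cloudwatch-agent.json into SSM Parameter Store (or store the file generated by the agent's configuration wizard). A parameter name that starts with AmazonCloudWatch- is readable under CloudWatchAgentServerPolicy alone.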

Terraform example:

resource "aws_ssm_parameter" "cw_agent_config" {
  name  = "AmazonCloudWatch-linux"
  type  = "String"
  value = file("cw-agent-config.json")
}

This way, when the ASG spins up new instances, each pulls the same config at boot.
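The answer does not show what cw-agent-config.json contains. The following is a minimal sketch of one possible file, written by a hypothetical helper named write_cw_agent_config (the name, the choice of measurements, and the "/" disk resource are assumptions, not part of the original answer). It uses append_dimensions to tag metrics with the ASG name and aggregation_dimensions to publish a roll-up keyed only by that name, which is what an ASG-level dashboard widget would query.

```shell
#!/bin/bash
# Hypothetical helper: writes a minimal CloudWatch agent config to the path in $1.
write_cw_agent_config() {
  # The quoted delimiter ('EOF') keeps ${aws:...} literal, so the agent,
  # not the shell, resolves the placeholder at runtime.
  cat > "$1" <<'EOF'
{
  "metrics": {
    "append_dimensions": {
      "AutoScalingGroupName": "${aws:AutoScalingGroupName}"
    },
    "aggregation_dimensions": [["AutoScalingGroupName"]],
    "metrics_collected": {
      "cpu":  { "measurement": ["cpu_usage_active"], "totalcpu": true },
      "mem":  { "measurement": ["mem_used_percent"] },
      "disk": { "measurement": ["used_percent"], "resources": ["/"] }
    }
  }
}
EOF
}

# Usage: write_cw_agent_config cw-agent-config.json
```

Running the helper next to the Terraform code produces the file that file("cw-agent-config.json") reads. Without aggregation_dimensions, the cpu and disk metrics carry extra dimensions (such as cpu or path), so a widget that lists only AutoScalingGroupName would find no data.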

  2. IAM Role for Instances

Your ASG instances must have an Instance Profile with these managed policies attached:

  • CloudWatchAgentServerPolicy
  • AmazonSSMManagedInstanceCore

Together these let the instances fetch the config and push metrics.

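  3. Dashboards

Dashboards don't auto-replicate per instance. Instead, design them around the ASG name rather than a specific instance ID, either with a SEARCH expression or by referencing a metric whose only dimension is AutoScalingGroupName. For the second option, the agent config must use append_dimensions to add AutoScalingGroupName and aggregation_dimensions to roll the metrics up to it.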

Example:


{
  "metrics": [
    [ "CWAgent", "mem_used_percent", "AutoScalingGroupName", "my-asg" ]
  ]
}

That way, when new instances join the group, their data points land in the same ASG-level metric and the dashboard shows them without any change.
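The fragment above is only the metrics array. A complete dashboard body also needs a widget wrapper, and metric widgets need a region property. As a sketch, the hypothetical helper dashboard_body below prints a full one-widget body for a given ASG name and region (the helper name, title, and layout values are assumptions for illustration).

```shell
#!/bin/bash
# Hypothetical helper: prints a one-widget dashboard body.
# $1 = ASG name, $2 = AWS region (both supplied by the caller).
dashboard_body() {
  local asg_name="$1" region="$2"
  cat <<EOF
{
  "widgets": [
    {
      "type": "metric",
      "x": 0, "y": 0, "width": 12, "height": 6,
      "properties": {
        "title": "ASG Memory Usage (%)",
        "region": "${region}",
        "stat": "Average",
        "period": 300,
        "metrics": [
          [ "CWAgent", "mem_used_percent", "AutoScalingGroupName", "${asg_name}" ]
        ]
      }
    }
  ]
}
EOF
}

# Usage (needs AWS CLI credentials, so it is left commented out):
# aws cloudwatch put-dashboard --dashboard-name asg-dash \
#   --dashboard-body "$(dashboard_body my-asg us-east-1)"
```

The same JSON can be passed to Terraform's aws_cloudwatch_dashboard through dashboard_body. The dashboard is created once per ASG, not once per instance.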

  4. Terraform Wiring

Launch Template + ASG:

resource "aws_launch_template" "example" {
  name_prefix   = "asg-cw-"
  image_id      = data.aws_ami.amazon_linux.id
  instance_type = "t3.micro"
  user_data     = base64encode(file("user_data.sh"))
  iam_instance_profile {
    name = aws_iam_instance_profile.asg_profile.name
  }
}

resource "aws_autoscaling_group" "example" {
  desired_capacity     = 2
  max_size             = 5
  min_size             = 1
  launch_template {
    id      = aws_launch_template.example.id
    version = "$Latest"
  }
  vpc_zone_identifier = ["subnet-123333", "subnet-67687688"] # placeholder subnet IDs, quoted as strings
}

Dashboard: use Terraform aws_cloudwatch_dashboard with ASG-based metrics.

Result: Every new instance launched by the ASG auto-installs the CloudWatch agent, pushes CPU/memory/disk metrics, and your dashboards keep showing them without you having to update anything manually.
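If metrics do not show up, a first check is whether the agent actually started on a freshly launched instance. The agent's control script prints a small JSON status document for -a status. The sketch below assumes that output contains a "status" field and wraps the check in a hypothetical helper named agent_is_running.

```shell
#!/bin/bash
# Hypothetical helper: reads the agent's status JSON on stdin and succeeds
# only if the top-level "status" field is "running".
agent_is_running() {
  grep -Eq '"status"[[:space:]]*:[[:space:]]*"running"'
}

# Usage on an instance (left commented out because it needs the agent installed):
# /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -m ec2 -a status \
#   | agent_is_running && echo "agent running" || echo "agent NOT running"
```

If the agent is not running, /var/log/cloud-init-output.log shows what the user data script printed, and /opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log shows the agent's own errors, such as a missing IAM permission or an invalid config.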

answered 8 months ago
  • Tried it, but it's not working. Can you share an email ID so I can send you the code file?

  • Tried it; not working.

0

Sure, please share the module snippet here.

answered 8 months ago
  • resource "aws_autoscaling_group" "web" {
      name                = "web-asg"
      vpc_zone_identifier = ["subnet"]
      min_size            = 1
      max_size            = 3
      desired_capacity    = 1
      launch_template {
        id      = aws_launch_template.web.id
        version = "$Latest"
      }
    }

    resource "aws_autoscaling_policy" "up" {
      name                   = "scale-up"
      scaling_adjustment     = 1
      adjustment_type        = "ChangeInCapacity"
      cooldown               = 300
      autoscaling_group_name = aws_autoscaling_group.web.name
    }

    resource "aws_autoscaling_policy" "down" {
      name                   = "scale-down"
      scaling_adjustment     = -1
      adjustment_type        = "ChangeInCapacity"
      cooldown               = 300
      autoscaling_group_name = aws_autoscaling_group.web.name
    }

    resource "aws_cloudwatch_metric_alarm" "cpu_high" {
      alarm_name          = "HighCPU"
      metric_name         = "CPUUtilization"
      namespace           = "AWS/EC2"
      comparison_operator = "GreaterThanThreshold"
      threshold           = 70
      period              = 60
      statistic           = "Average"
      evaluation_periods  = 2
      alarm_actions       = [aws_autoscaling_policy.up.arn]
      dimensions          = { AutoScalingGroupName = aws_autoscaling_group.web.name }
    }

    resource "aws_cloudwatch_metric_alarm" "cpu_low" {
      alarm_name          = "LowCPU"
      metric_name         = "CPUUtilization"
      namespace           = "AWS/EC2"
      comparison_operator = "LessThanThreshold"
      threshold           = 20
      period              = 60
      statistic           = "Average"
      evaluation_periods  = 2
      alarm_actions       = [aws_autoscaling_policy.down.arn]
      dimensions          = { AutoScalingGroupName = aws_autoscaling_group.web.name }
    }

    resource "aws_cloudwatch_dashboard" "main" {
      dashboard_name = "asg-dash"
      dashboard_body = jsonencode({
        widgets = [{
          type = "metric",
          properties = {
            metrics = [["AWS/EC2", "CPUUtilization", "AutoScalingGroupName

  • Still not working. Kindly give me a proper solution.

0

How to Extend Properly

  • Create an SSM Parameter holding the CloudWatch Agent config (CPU, memory, and disk metrics, with the ASG name as a dimension).
  • Attach CloudWatchAgentServerPolicy and AmazonSSMManagedInstanceCore to the ASG's EC2 role.
  • Update the Launch Template with user_data that installs and starts the agent on boot, pulling the config from SSM.
  • Update the Dashboard to point to CWAgent namespace metrics (cpu_usage_active, mem_used_percent, disk_used_percent) with the AutoScalingGroupName dimension.
  • (Optional) Autoscaling Policies: instead of scaling only on raw EC2 CPU, you can scale on memory or disk, because those metrics are now available.
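For the optional last step, one way to scale on memory is a target tracking policy with a customized metric. The sketch below is an AWS CLI variant rather than Terraform, and it assumes the agent publishes mem_used_percent in the CWAgent namespace with AutoScalingGroupName as the only dimension. The helper name mem_target_tracking_config and the policy name are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: prints a target tracking configuration for memory.
# $1 = ASG name, $2 = target memory utilisation in percent.
mem_target_tracking_config() {
  local asg_name="$1" target="$2"
  cat <<EOF
{
  "TargetValue": ${target},
  "CustomizedMetricSpecification": {
    "MetricName": "mem_used_percent",
    "Namespace": "CWAgent",
    "Dimensions": [
      { "Name": "AutoScalingGroupName", "Value": "${asg_name}" }
    ],
    "Statistic": "Average"
  }
}
EOF
}

# Usage (needs AWS CLI credentials, so it is left commented out):
# aws autoscaling put-scaling-policy \
#   --auto-scaling-group-name web-asg \
#   --policy-name mem-target-60 \
#   --policy-type TargetTrackingScaling \
#   --target-tracking-configuration "$(mem_target_tracking_config web-asg 60)"
```

With target tracking, the ASG adds or removes instances to keep the group's average memory use near the target, so separate scale-up and scale-down alarms are not needed for this metric.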

=======================================================================

SSM Parameter for CW Agent config

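resource "aws_ssm_parameter" "cw_agent_config" {
  name = "/my-asg/cwagent-config"
  type = "String"
  # $${...} escapes Terraform interpolation, so the agent receives the literal
  # ${aws:AutoScalingGroupName} placeholder.
  # aggregation_dimensions publishes a roll-up keyed only by the ASG name,
  # which is what the dashboard widgets query.
  value = <<EOT
{
  "metrics": {
    "append_dimensions": {
      "AutoScalingGroupName": "$${aws:AutoScalingGroupName}"
    },
    "aggregation_dimensions": [["AutoScalingGroupName"]],
    "metrics_collected": {
      "cpu":  { "measurement": ["cpu_usage_active"] },
      "mem":  { "measurement": ["mem_used_percent"] },
      "disk": { "measurement": ["used_percent"], "resources": ["*"] }
    }
  }
}
EOT
}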

IAM Role + Policies

resource "aws_iam_role" "ec2_role" {
  name = "asg-ec2-role"
  assume_role_policy = jsonencode({
    Version = "2012-10-17",
    Statement = [{
      Effect    = "Allow",
      Action    = "sts:AssumeRole",
      Principal = { Service = "ec2.amazonaws.com" }
    }]
  })
}

resource "aws_iam_role_policy_attachment" "cw_agent" {
  role       = aws_iam_role.ec2_role.name
  policy_arn = "arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy"
}

resource "aws_iam_role_policy_attachment" "ssm" {
  role       = aws_iam_role.ec2_role.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
}

resource "aws_iam_instance_profile" "ec2_profile" {
  name = "asg-ec2-profile"
  role = aws_iam_role.ec2_role.name
}

Launch Template with UserData to install CW Agent

resource "aws_launch_template" "web" {
  name_prefix   = "web-lt"
  image_id      = "ami-33333"
  instance_type = "t3.micro"

  iam_instance_profile {
    name = aws_iam_instance_profile.ec2_profile.name
  }

  # The shebang must be the first line of the script, so the heredoc body
  # starts at column 0. The trailing backslash continues the agent command.
  user_data = base64encode(<<EOF
#!/bin/bash
yum install -y amazon-cloudwatch-agent
/opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl \
  -a fetch-config -m ec2 -c ssm:${aws_ssm_parameter.cw_agent_config.name} -s
EOF
  )
}

Auto Scaling Group

resource "aws_autoscaling_group" "web" {
  name                = "web-asg"
  vpc_zone_identifier = ["subnet-123333"]
  min_size            = 1
  max_size            = 3
  desired_capacity    = 1

  launch_template {
    id      = aws_launch_template.web.id
    version = "$Latest"
  }
}

Sample dashboard

resource "aws_cloudwatch_dashboard" "asg_dashboard" {
  dashboard_name = "asg-dash"
  dashboard_body = jsonencode({
    widgets = [
      {
        type = "metric",
        properties = {
          region  = "us-east-1" # assumption: replace with your region (metric widgets require it)
          metrics = [["CWAgent", "cpu_usage_active", "AutoScalingGroupName", aws_autoscaling_group.web.name]],
          title   = "ASG CPU Usage (%)"
        }
      },
      {
        type = "metric",
        properties = {
          region  = "us-east-1" # assumption: replace with your region
          metrics = [["CWAgent", "mem_used_percent", "AutoScalingGroupName", aws_autoscaling_group.web.name]],
          title   = "ASG Memory Usage (%)"
        }
      },
      {
        type = "metric",
        properties = {
          region = "us-east-1" # assumption: replace with your region
          # The agent publishes the disk "used_percent" measurement as disk_used_percent
          metrics = [["CWAgent", "disk_used_percent", "AutoScalingGroupName", aws_autoscaling_group.web.name]],
          title   = "ASG Disk Usage (%)"
        }
      }
    ]
  })
}

answered 8 months ago
0

Short Answer

You don't extend this per instance. Instead, you make it part of the launch template and use dynamic dashboarding.

  • Install & configure the CloudWatch Agent via Launch Template (user data)
  • Send metrics with dimensions like AutoScalingGroupName
  • Build dashboards that use ASG-level aggregation, not per-instance hardcoding

Correct Architecture

1. Auto-install CloudWatch Agent (Launch Template)

Put your setup into user data so every new instance configures itself:

#!/bin/bash
yum update -y

# Install CloudWatch Agent
yum install -y amazon-cloudwatch-agent

# Write config (aggregation_dimensions publishes an extra roll-up keyed only
# by AutoScalingGroupName, which the ASG-level dashboard widget queries)
cat <<EOF > /opt/aws/amazon-cloudwatch-agent/bin/config.json
{
  "metrics": {
    "namespace": "Custom/EC2",
    "append_dimensions": {
      "AutoScalingGroupName": "\${aws:AutoScalingGroupName}",
      "InstanceId": "\${aws:InstanceId}"
    },
    "aggregation_dimensions": [["AutoScalingGroupName"]],
    "metrics_collected": {
      "mem": {
        "measurement": ["mem_used_percent"]
      },
      "disk": {
        "measurement": ["used_percent"],
        "resources": ["*"]
      }
    }
  }
}
EOF

# Start agent
/opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl \
-a fetch-config -m ec2 \
-c file:/opt/aws/amazon-cloudwatch-agent/bin/config.json -s

Attach this Launch Template to your ASG ✔️


2. IAM Role (Critical)

Ensure your EC2 instances have:

AmazonSSMManagedInstanceCore
CloudWatchAgentServerPolicy

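Without CloudWatchAgentServerPolicy, the agent cannot publish metrics. AmazonSSMManagedInstanceCore matters when the instances or their agent config are managed through SSM.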


3. Use ASG-Level Metrics in Dashboard

❗ Don’t create dashboards per instance — that won’t scale.

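Instead, define metrics keyed on the ASG name (the agent must publish a roll-up with AutoScalingGroupName as its only dimension, for example through aggregation_dimensions):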

{
  "metrics": [
    [ "Custom/EC2", "mem_used_percent", "AutoScalingGroupName", "my-asg" ]
  ],
  "stat": "Average"
}

This automatically includes:

  • All current instances
  • Any future instances

4. (Optional) Dynamic Dashboards

If you want more flexibility:

  • Use CloudWatch Metric Math:

    • SEARCH('{Custom/EC2,AutoScalingGroupName} MetricName="mem_used_percent"', 'Average', 300)
  • Or generate dashboards via Terraform using ASG name as variable
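To avoid hand-editing the SEARCH string for each metric, it can be built from its parts. The sketch below uses a hypothetical helper named search_expr; it only assembles the string and assumes the metrics are published with AutoScalingGroupName as their only dimension.

```shell
#!/bin/bash
# Hypothetical helper: prints a CloudWatch SEARCH expression.
# $1 = namespace, $2 = metric name, $3 = statistic, $4 = period in seconds.
search_expr() {
  printf "SEARCH('{%s,AutoScalingGroupName} MetricName=\"%s\"', '%s', %s)\n" \
    "$1" "$2" "$3" "$4"
}

# Usage:
# search_expr Custom/EC2 mem_used_percent Average 300
```

In a dashboard widget, the resulting string goes into the metrics array as an expression entry, for example [ { "expression": "<output of search_expr>", "id": "e1" } ], with the inner double quotes escaped for JSON.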


Common Mistake (What You Were Doing)

Creating dashboards during instance provisioning

This breaks with ASG because:

  • Instances are ephemeral
  • You’ll end up with duplicate or stale dashboards

Professional Takeaway 💡

You’re moving from instance-centric → fleet-centric thinking, which is key in AWS.

The upgrade in mindset:

  • Bake configuration into infrastructure (Launch Template)
  • Use dimensions + aggregation, not static resources
  • Design everything assuming instances are disposable

That shift is what makes systems truly scalable. ✔️


Confidence

Very high (95%) — This is the standard AWS pattern for ASG + CloudWatch integration.

answered 2 months ago
