I want to troubleshoot why I can't start AWS Systems Manager Agent (SSM Agent) on my Amazon Elastic Compute Cloud (Amazon EC2) Windows instance.
Resolution
Note: To determine whether an instance meets the requirements for a managed node, it's a best practice to use ssm-cli, which is included with SSM Agent version 3.1.501.0 and later. For more information, see Troubleshooting managed node availability using ssm-cli.
SSM Agent might fail to start on your Windows instance for the following reasons.
The IAM role doesn't have the necessary permissions
The instance must have the correct permissions to make API calls to a Systems Manager endpoint. Attach the AmazonSSMManagedInstanceCore permissions policy to the AWS Identity and Access Management (IAM) role that's associated with your instance. If you're using a custom IAM policy, then confirm that your custom policy includes the permissions in AmazonSSMManagedInstanceCore. Also, make sure that the trust policy of the IAM role allows ec2.amazonaws.com to assume the role. For more information, see Step 1: Configure instance permissions for Systems Manager.
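The trust policy check can be sketched in a few lines. This is a minimal illustration, assuming that you already have the role's trust policy as a parsed JSON document (for example, copied from the IAM console). The function name allows_ec2_to_assume and the example policy are hypothetical and aren't part of any AWS tool.

```python
def allows_ec2_to_assume(trust_policy: dict) -> bool:
    """Return True if an Allow statement lets ec2.amazonaws.com call sts:AssumeRole."""
    statements = trust_policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may appear without a list
        statements = [statements]
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        principal = stmt.get("Principal", {})
        if not isinstance(principal, dict):  # for example, "Principal": "*"
            continue
        services = principal.get("Service", [])
        if isinstance(services, str):
            services = [services]
        if "sts:AssumeRole" in actions and "ec2.amazonaws.com" in services:
            return True
    return False


# Hypothetical example of a trust policy that passes the check.
EXAMPLE_TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "ec2.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}
```

The sketch ignores Condition blocks and wildcard actions, so treat a False result as a prompt to review the policy manually rather than as a definitive answer.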
SSM Agent can't access the instance's metadata
SSM Agent must communicate with the instance metadata service to get the necessary information about the instance. To test access to your instance's metadata, open http://169.254.169.254/latest/meta-data/ in a browser on the running instance.
If you can't access the metadata, then run the route print command from Windows PowerShell or the command prompt (CMD). Review the output to confirm that routes similar to the following exist:
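A browser request doesn't send a session token, so it might fail on instances that require Instance Metadata Service Version 2 (IMDSv2). The following sketch shows a token-based request with the Python standard library, assuming that Python is available on the instance. The function names are illustrative. The opener bypasses any system proxy so that the request goes directly to the link-local address.

```python
import urllib.request

IMDS_BASE = "http://169.254.169.254/latest"


def build_token_request(ttl_seconds: int = 21600) -> urllib.request.Request:
    """Build the PUT request that asks IMDSv2 for a session token."""
    return urllib.request.Request(
        f"{IMDS_BASE}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def build_metadata_request(token: str, path: str = "meta-data/instance-id") -> urllib.request.Request:
    """Build the GET request for a metadata path, authenticated with the token."""
    return urllib.request.Request(
        f"{IMDS_BASE}/{path}",
        headers={"X-aws-ec2-metadata-token": token},
    )


def fetch_instance_id(timeout: float = 2.0) -> str:
    """Fetch the instance ID. Raises URLError or OSError if the metadata service is unreachable."""
    # An empty ProxyHandler makes sure that no proxy is used for this request.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(build_token_request(), timeout=timeout) as resp:
        token = resp.read().decode()
    with opener.open(build_metadata_request(token), timeout=timeout) as resp:
        return resp.read().decode()
```

Call fetch_instance_id() on the instance itself. A timeout or connection error suggests a routing or firewall issue between the operating system and 169.254.169.254.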
Persistent Routes:
  Network Address          Netmask  Gateway Address  Metric
  169.254.169.254  255.255.255.255      172.31.16.1      15
  169.254.169.250  255.255.255.255      172.31.16.1      15
  169.254.169.251  255.255.255.255      172.31.16.1      15
  169.254.169.249  255.255.255.255      172.31.16.1      15
  169.254.169.123  255.255.255.255      172.31.16.1      15
  169.254.169.253  255.255.255.255      172.31.16.1      15
If the route is absent or the gateway address doesn't match the current subnet, then complete one of the following steps:
- Manually add the route:
  route add -p 169.254.169.254 mask 255.255.255.255 x.x.x.x
  Note: Replace x.x.x.x with your instance's gateway address.
- If the instance has EC2Launch version 1, then run the following command from an elevated Windows PowerShell session:
  Import-Module C:\ProgramData\Amazon\EC2-Windows\Launch\Module\Ec2Launch.psm1 ; Add-Routes
- If the instance has EC2Launch version 2 or EC2Config, then restart the EC2Launch or EC2Config service. The restart sets the non-persistent static routes that the instance uses to reach the metadata service and AWS KMS servers.
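The route review described above can be sketched as a small parser. This is an illustration only, assuming that you saved the output of route print to a string and that you know the CIDR block of the instance's subnet. The function names are hypothetical. The check treats a gateway as a match when it falls inside the subnet, which is an approximation of "matches the current subnet".

```python
import ipaddress
from typing import Optional

METADATA_IP = "169.254.169.254"


def find_route_gateway(route_print_output: str, destination: str = METADATA_IP) -> Optional[str]:
    """Return the gateway column of the first route row for destination, or None."""
    for line in route_print_output.splitlines():
        fields = line.split()
        # Route rows look like: destination, netmask, gateway, ...
        if len(fields) >= 3 and fields[0] == destination:
            return fields[2]
    return None


def gateway_in_subnet(gateway: str, subnet_cidr: str) -> bool:
    """Return True if the gateway is an IP address inside the subnet CIDR block."""
    try:
        return ipaddress.ip_address(gateway) in ipaddress.ip_network(subnet_cidr)
    except ValueError:  # for example, the gateway column reads "On-link"
        return False


def route_add_command(gateway: str, destination: str = METADATA_IP) -> str:
    """Build the persistent route command for the given gateway."""
    return f"route add -p {destination} mask 255.255.255.255 {gateway}"


SAMPLE_OUTPUT = """Persistent Routes:
  Network Address          Netmask  Gateway Address  Metric
  169.254.169.254  255.255.255.255      172.31.16.1      15
"""
```

With the sample output, find_route_gateway returns 172.31.16.1. If the function returns None, or gateway_in_subnet returns False for your subnet, then the route is a candidate for one of the fixes in the preceding list.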
If the route exists but the instance can't retrieve metadata, then review your instance's Windows Defender Firewall, third-party firewall, or antivirus configuration. Confirm that the traffic to 169.254.169.254 isn't explicitly denied. For more information, see Why does my Amazon EC2 Windows instance generate a "Waiting for the metadata service" error?
The instance isn't connected to the SSM endpoint
Verify that the instance has connectivity to Systems Manager endpoints on port 443. Run the following Windows PowerShell commands to verify connectivity. Replace RegionID with the AWS Region where the instance is located:
- Test-NetConnection ssm.RegionID.amazonaws.com -Port 443
- Test-NetConnection ec2messages.RegionID.amazonaws.com -Port 443
- Test-NetConnection ssmmessages.RegionID.amazonaws.com -Port 443
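If you prefer to script the same check, the following sketch opens a TCP connection to each endpoint on port 443. It assumes that Python is available on the instance and that the Region uses the standard amazonaws.com domain. The function names are illustrative.

```python
import socket
from typing import Dict, List

SERVICES = ("ssm", "ec2messages", "ssmmessages")


def endpoint_hosts(region: str) -> List[str]:
    """Build the three Systems Manager endpoint host names for a Region."""
    return [f"{service}.{region}.amazonaws.com" for service in SERVICES]


def can_connect(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers DNS failures, refused connections, and timeouts
        return False


def check_endpoints(region: str) -> Dict[str, bool]:
    """Map each endpoint host name to its connectivity result."""
    return {host: can_connect(host) for host in endpoint_hosts(region)}
```

For example, check_endpoints("us-east-1") returns a dictionary with one True or False value for each endpoint. A False value doesn't distinguish between DNS resolution failures and blocked traffic, so follow up with the subnet checks in the next sections.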
Public subnets
Systems Manager endpoints are public endpoints. This means that your instance must use an internet gateway to reach the internet. If you experience issues when you connect to the endpoints from instances in a public subnet, then confirm the following configurations:
- The route table that your instance uses must contain a route to the internet.
- Your Amazon Virtual Private Cloud (Amazon VPC) security groups and network access control lists (network ACLs) must allow outbound connections on port 443.
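The two configuration checks above can be sketched against data that you already retrieved. The sketch assumes route and rule dictionaries in the shape that the Amazon EC2 DescribeRouteTables and DescribeSecurityGroups APIs return (the Routes and IpPermissionsEgress lists). The function names are hypothetical, and the egress check ignores destination CIDR ranges and network ACLs.

```python
def has_internet_gateway_route(routes) -> bool:
    """Return True if an active default route points to an internet gateway."""
    for route in routes:
        if (
            route.get("DestinationCidrBlock") == "0.0.0.0/0"
            and route.get("GatewayId", "").startswith("igw-")
            and route.get("State", "active") == "active"
        ):
            return True
    return False


def allows_outbound_443(egress_rules) -> bool:
    """Return True if any egress rule covers TCP port 443."""
    for rule in egress_rules:
        protocol = rule.get("IpProtocol")
        if protocol == "-1":  # "-1" means all protocols and all ports
            return True
        if protocol == "tcp" and rule.get("FromPort", 0) <= 443 <= rule.get("ToPort", 65535):
            return True
    return False
```

Pass the Routes list of the route table that's associated with the instance's subnet, and the IpPermissionsEgress list of each security group that's attached to the instance.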
Private subnets
For private subnets, your instance must use a NAT gateway to reach the internet. Alternatively, configure Amazon VPC endpoints for Systems Manager so that the instance can reach the Systems Manager endpoints without internet access. With VPC endpoints, the instance uses private IP addresses to access Amazon EC2 and Systems Manager APIs. For more information, see Why is my EC2 instance not displaying as a managed node or showing a "Connection lost" status in Systems Manager?
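As a rough aid for the VPC endpoint option, the following sketch lists the endpoint service names that correspond to the three host names tested earlier and reports which ones are missing. It assumes the standard com.amazonaws.Region.service naming and that you collected the ServiceName values of the VPC's existing endpoints, for example from the DescribeVpcEndpoints API. The function names are illustrative.

```python
from typing import Iterable, List

SERVICES = ("ssm", "ec2messages", "ssmmessages")


def required_endpoint_services(region: str) -> List[str]:
    """Build the VPC endpoint service names for the Systems Manager endpoints."""
    return [f"com.amazonaws.{region}.{service}" for service in SERVICES]


def missing_endpoint_services(region: str, existing_service_names: Iterable[str]) -> List[str]:
    """Return the required service names that have no endpoint in the VPC."""
    existing = set(existing_service_names)
    return [name for name in required_endpoint_services(region) if name not in existing]
```

An empty list from missing_endpoint_services means that an endpoint exists for each service. It doesn't confirm that the endpoint's security group allows inbound traffic on port 443 from the instance.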
The proxy settings aren't applied
If you use a proxy, then SSM Agent evaluates the proxy settings and applies them to the agent configuration when the agent starts. After you change the proxy settings, restart SSM Agent so that the changes take effect. For more information, see Configure SSM Agent to use a proxy for Windows Server instances.
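For reference, the linked proxy guide configures the agent through environment-style entries (http_proxy, https_proxy, and no_proxy) on the AmazonSSMAgent service. The following sketch only builds those entries so that you can compare them with what's configured on the instance. It's an assumption-laden illustration: the function name and the example proxy host are hypothetical, and the sketch doesn't write to the registry or restart the service.

```python
from typing import List


def proxy_environment(proxy_host: str, proxy_port: int, imds_ip: str = "169.254.169.254") -> List[str]:
    """Build the proxy entries, excluding the instance metadata address from the proxy."""
    proxy = f"{proxy_host}:{proxy_port}"
    return [
        f"http_proxy={proxy}",
        f"https_proxy={proxy}",
        # Metadata requests must go directly to the link-local address.
        f"no_proxy={imds_ip}",
    ]
```

For a hypothetical proxy at proxy.example.internal on port 3128, the function returns three entries. The no_proxy entry matters because SSM Agent can't reach the instance metadata service through a proxy.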
SSM Agent isn't the latest version
It's a best practice to download and manually install the latest SSM Agent version. For information about the latest versions, see the Amazon SSM Agent releases on the GitHub website.
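To compare the installed version with a release listed on GitHub, compare the version numbers component by component rather than as plain strings. The following is a minimal sketch with hypothetical function names. It assumes dotted numeric versions such as 3.1.501.0.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Convert a dotted version string such as '3.1.501.0' to a tuple of integers."""
    return tuple(int(part) for part in version.strip().lstrip("v").split("."))


def is_outdated(installed: str, latest: str) -> bool:
    """Return True if the installed version is lower than the latest version."""
    return parse_version(installed) < parse_version(latest)
```

A string comparison would rank 3.10.0.0 below 3.9.0.0, which is why the sketch converts each component to an integer first.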
If the instance fails to start SSM Agent after you complete the preceding steps, then view the SSM Agent logs in %PROGRAMDATA%\Amazon\SSM\Logs\ to troubleshoot further.
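When you review the logs, it can help to filter for error entries first. The following sketch assumes the default Windows log location, %PROGRAMDATA%\Amazon\SSM\Logs\amazon-ssm-agent.log, and assumes that error entries contain the word ERROR surrounded by spaces. The function names and the sample lines are hypothetical.

```python
from typing import List

# Default log path on Windows (an assumption; adjust if your installation differs).
DEFAULT_LOG_PATH = r"C:\ProgramData\Amazon\SSM\Logs\amazon-ssm-agent.log"


def error_lines(log_text: str) -> List[str]:
    """Return the log lines that are marked with the ERROR level."""
    return [line for line in log_text.splitlines() if " ERROR " in line]


def read_error_lines(path: str = DEFAULT_LOG_PATH) -> List[str]:
    """Read the log file and return its ERROR lines."""
    with open(path, encoding="utf-8", errors="replace") as log_file:
        return error_lines(log_file.read())


# Hypothetical sample lines, for illustration only.
SAMPLE_LOG = (
    "2024-01-01 10:00:00 INFO starting agent\n"
    "2024-01-01 10:00:01 ERROR could not reach endpoint\n"
)
```

Run read_error_lines() on the instance, then match the messages against the preceding sections, such as permissions, metadata access, or endpoint connectivity.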