How do I address the degradation of underlying hardware that hosts my EC2 instance?

I received a notice that there’s degradation of the underlying hardware that hosts my Amazon Elastic Compute Cloud (Amazon EC2) instance.

Short description

If a hardware malfunction occurs, then Amazon EC2 tags the specific hardware as faulty. Any instances that run on the hypervisor of the faulty hardware must move to healthy hardware. For this transition, Amazon EC2 stops the instances that are backed by Amazon Elastic Block Store (Amazon EBS) and terminates instance store-backed instances. Amazon EC2 also sends a notification by email and in AWS Health Dashboard about the hardware degradation and the upcoming instance stop or termination. You can also manually stop or terminate the instance to begin the transition sooner. After you stop an instance, you must start it again to move it to healthy underlying hardware. A terminated instance can't be started again, so you must launch a replacement instance.
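The scheduled stop or retirement that the notification describes can also be read from the instance status. The following is a minimal sketch, assuming a boto3 EC2 client is passed in as `ec2_client`. The function name `pending_stop_events` is hypothetical, and the sketch looks only at the `instance-stop` and `instance-retirement` event codes.

```python
from typing import Any, Dict, List

# Event codes that mean the instance will be stopped or retired.
# Other codes (for example "system-reboot") are ignored in this sketch.
STOP_OR_RETIRE_CODES = {"instance-stop", "instance-retirement"}


def pending_stop_events(ec2_client: Any, instance_id: str) -> List[Dict[str, Any]]:
    """Return scheduled stop or retirement events that are still pending."""
    response = ec2_client.describe_instance_status(
        InstanceIds=[instance_id],
        IncludeAllInstances=True,  # also report instances that aren't running
    )
    pending = []
    for status in response.get("InstanceStatuses", []):
        for event in status.get("Events", []):
            description = event.get("Description", "")
            # Finished events stay visible for a while with a bracketed
            # prefix such as "[Completed]" or "[Canceled]"; skip those.
            if event.get("Code") in STOP_OR_RETIRE_CODES and not description.startswith("["):
                pending.append(event)
    return pending
```

With boto3 installed and credentials configured, a call might look like `pending_stop_events(boto3.client("ec2"), "i-0123456789abcdef0")`, where the instance ID is a placeholder. Each returned event includes a `NotBefore` timestamp for the earliest scheduled start of the event.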

Note: For instances that launched from an Amazon EC2 Auto Scaling group, the instance termination and replacement occur immediately. Amazon EC2 Auto Scaling automatically replaces instances as soon as they have a future scheduled maintenance or retirement event. By the time you receive a hardware degradation notification, you can't see the original instance in your dashboard. To view the termination event, review the AWS CloudTrail log file for that instance.
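To find the termination event for an instance that Amazon EC2 Auto Scaling already replaced, you can search the CloudTrail event history by resource name. The following is a minimal sketch, assuming a boto3 CloudTrail client is passed in as `cloudtrail_client`. The function name `termination_events` is hypothetical, and the sketch assumes that the termination is recorded under the event name `TerminateInstances`. Event history lookups cover only the last 90 days of management events.

```python
from typing import Any, Dict, List


def termination_events(cloudtrail_client: Any, instance_id: str) -> List[Dict[str, Any]]:
    """Return CloudTrail TerminateInstances events that reference the instance."""
    paginator = cloudtrail_client.get_paginator("lookup_events")
    pages = paginator.paginate(
        # lookup_events accepts a single lookup attribute per request.
        LookupAttributes=[
            {"AttributeKey": "ResourceName", "AttributeValue": instance_id}
        ]
    )
    found = []
    for page in pages:
        for event in page.get("Events", []):
            # Assumption: the termination appears as "TerminateInstances".
            if event.get("EventName") == "TerminateInstances":
                found.append(event)
    return found
```

Each returned event carries an `EventTime` and a `Username` field, which can help confirm when the termination happened and which principal or service called it.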

Resolution

Manually stop and start your instance with the Amazon EC2 console or AWS Command Line Interface (AWS CLI). Stopping the instance removes it from the faulty hardware. Starting it again launches it on healthy hardware.

Note: If you receive errors when you run AWS CLI commands, see Troubleshoot AWS CLI errors. Also, make sure that you're using the most recent AWS CLI version.

Stop and start the instance

Note: A stop and start isn't equivalent to a reboot. A reboot keeps the instance on the same host, so only a stop followed by a start migrates the instance to healthy hardware.

Before you proceed, note the following conditions for stopping and starting an instance:

  • Data on instance store volumes doesn't persist when an instance stops. If your instance is instance store-backed or has instance store volumes that contain data, then you lose that data, so copy anything that you need to persistent storage before you proceed. For more information, see Determining the root device type of your instance.
  • Stopping and starting the instance releases its auto-assigned public IP address, and the instance receives a new one when it starts. An Elastic IP address stays associated with the instance, so it's a best practice to use an Elastic IP address when you route external traffic to your instance.
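Both conditions can be checked before you stop the instance. The following is a minimal sketch, assuming a boto3 EC2 client is passed in as `ec2_client`. The function name `stop_start_risks` and the warning texts are hypothetical. The sketch inspects only the root device type and the public IP address. It doesn't detect additional instance store volumes attached to an EBS-backed instance.

```python
from typing import Any, List


def stop_start_risks(ec2_client: Any, instance_id: str) -> List[str]:
    """Return human-readable warnings to review before a stop and start."""
    reservations = ec2_client.describe_instances(InstanceIds=[instance_id])["Reservations"]
    instance = reservations[0]["Instances"][0]
    risks = []

    # RootDeviceType is either "ebs" or "instance-store".
    if instance.get("RootDeviceType") == "instance-store":
        risks.append("Root device is instance store-backed; its data won't persist.")

    public_ip = instance.get("PublicIpAddress")
    if public_ip:
        # Elastic IP addresses associated with this instance, if any.
        addresses = ec2_client.describe_addresses(
            Filters=[{"Name": "instance-id", "Values": [instance_id]}]
        )["Addresses"]
        elastic_ips = {address.get("PublicIp") for address in addresses}
        if public_ip not in elastic_ips:
            risks.append(f"Public IP {public_ip} isn't an Elastic IP address and will change.")
    return risks
```

An empty list means that neither condition was found. It doesn't guarantee that the instance holds no instance store data.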

To stop and start the instance, complete the following steps:

  1. Open the Amazon EC2 console and then select the instance.
  2. Select Actions, Instance State, Stop.
  3. Select Yes, Stop.
    Note: If your instance is stuck in the stopping state, you might need to force the instance to stop. For more information on stopping an instance stuck in the stopping state, see Troubleshooting stopping your instance.
  4. Select the instance again.
  5. Select Actions, Instance State, Start.
  6. Select Yes, Start.
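The same stop and start can be scripted instead of run from the console. The following is a minimal sketch, assuming a boto3 EC2 client is passed in as `ec2_client`. The function name `stop_then_start` and its `force` parameter are hypothetical. The sketch waits for the stopped state before it starts the instance, because a start request for an instance that is still stopping fails.

```python
from typing import Any


def stop_then_start(ec2_client: Any, instance_id: str, force: bool = False) -> None:
    """Stop an instance, wait until it is stopped, then start it again."""
    # Force=True corresponds to a forced stop for an instance that is
    # stuck in the stopping state; leave it False for a normal stop.
    ec2_client.stop_instances(InstanceIds=[instance_id], Force=force)

    # Block until the instance reports the "stopped" state.
    ec2_client.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

    # The start is the step that places the instance on healthy hardware.
    ec2_client.start_instances(InstanceIds=[instance_id])
    ec2_client.get_waiter("instance_running").wait(InstanceIds=[instance_id])
```

With boto3 installed and credentials configured, a call might look like `stop_then_start(boto3.client("ec2"), "i-0123456789abcdef0")`, where the instance ID is a placeholder. The waiters raise an error if the instance doesn't reach the expected state within their polling limit.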

Note: The hardware degradation notification remains in your AWS Health Dashboard with a status of Completed until the stop or terminate date listed in the notification.

(Optional) Set up instance recovery for your instances

You can create an Amazon CloudWatch alarm that automatically recovers instances that have underlying hardware degradation. For information on how to set up the CloudWatch alarm, see Recover your instance.
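A recovery alarm of this kind watches the system status check of one instance and triggers the EC2 recover action. The following is a minimal sketch that only builds the alarm parameters. The function name `recovery_alarm_params`, the alarm name, and the period and evaluation settings are illustrative assumptions, and the action ARN assumes the standard `aws` partition.

```python
from typing import Any, Dict


def recovery_alarm_params(instance_id: str, region: str) -> Dict[str, Any]:
    """Build put_metric_alarm parameters for an EC2 recover action."""
    return {
        "AlarmName": f"recover-{instance_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        # System status check failures point to problems with the host.
        "MetricName": "StatusCheckFailed_System",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": 60,  # seconds; illustrative value
        "EvaluationPeriods": 2,  # illustrative value
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        # Built-in EC2 action that recovers the instance.
        "AlarmActions": [f"arn:aws:automate:{region}:ec2:recover"],
    }
```

With boto3 installed, the parameters could be passed to a CloudWatch client as `boto3.client("cloudwatch").put_metric_alarm(**recovery_alarm_params("i-0123456789abcdef0", "us-east-1"))`, where the instance ID and Region are placeholders. Not every instance type and configuration supports recovery, so check the Recover your instance documentation first.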

AWS OFFICIAL, updated 6 months ago