My Amazon Elastic Compute Cloud (Amazon EC2) instance failed one or more status checks.
Short description
Amazon EC2 uses three status checks to monitor the health of EC2 instances:
- System status checks
- Instance status checks
- Attached EBS status checks
Important: Some status check resolutions require you to stop and start your instance. A reboot keeps the instance on the same physical host. To migrate to new hardware, you must stop and start the instance.
Resolution
First, view the instance's status check metrics to identify the status check that failed. Then, take the following actions based on the status check that failed.
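As one way to see which check failed, the following minimal sketch reads the instance status through the DescribeInstanceStatus API instead of the status check metrics. It assumes that `ec2` is a boto3 EC2 client, for example `boto3.client("ec2")`. The helper names `failed_checks` and `get_failed_checks` are hypothetical and not part of any AWS library.

```python
from typing import Any

# Keys that DescribeInstanceStatus uses for the three status checks.
CHECK_KEYS = {
    "SystemStatus": "system",
    "InstanceStatus": "instance",
    "AttachedEbsStatus": "attached EBS",
}


def failed_checks(entry: dict[str, Any]) -> list[str]:
    """Return the names of the status checks that report 'impaired'."""
    failed = []
    for key, name in CHECK_KEYS.items():
        status = entry.get(key, {}).get("Status")
        if status == "impaired":
            failed.append(name)
    return failed


def get_failed_checks(ec2: Any, instance_id: str) -> list[str]:
    """Look up one instance and report which of its status checks failed.

    `ec2` is assumed to be a boto3 EC2 client.
    """
    response = ec2.describe_instance_status(
        InstanceIds=[instance_id],
        IncludeAllInstances=True,  # also return instances that aren't running
    )
    statuses = response.get("InstanceStatuses", [])
    return failed_checks(statuses[0]) if statuses else []
```

An empty list means that none of the three checks currently reports an impaired status.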
Configure your instance for a stop and start
Note: When you stop and start an instance, the instance's public IP address changes. It's a best practice to use an Elastic IP address to route external traffic to your instance instead of a public IP address. If you use Amazon Route 53, then you might need to update the Route 53 DNS records when the public IP address changes. A stop and start is different from an instance reboot. For more information, see How EC2 instance stop and start works.
Before you stop and start your instance, take the following actions:
Troubleshoot system status check failures
System status check failures occur when there are issues with the underlying infrastructure that your instance runs on.
To troubleshoot system status check failures, check the AWS Health Dashboard for service interruptions within the AWS Region where your instance is located.
If there are no outages, then there's an issue with the underlying host of the instance. To troubleshoot this issue, complete the following steps to migrate the instance to a new underlying host:
- Stop the instance.
Note: If the instance is stuck in the Stopping state, then force stop the instance. This action can take up to 10 minutes.
- Start the instance.
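The stop and start steps above can be sketched with the AWS SDK for Python (Boto3). This is an illustration under assumptions, not the only way to do it: `ec2` is assumed to be a boto3 EC2 client, and `stop_and_start` is a hypothetical helper name.

```python
from typing import Any


def stop_and_start(ec2: Any, instance_id: str, force: bool = False) -> None:
    """Stop an instance, wait until it is stopped, then start it again.

    `ec2` is assumed to be a boto3 EC2 client. Set force=True only if the
    instance is stuck in the Stopping state, because a force stop can
    cause data loss.
    """
    ec2.stop_instances(InstanceIds=[instance_id], Force=force)
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

    ec2.start_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])
```

The waiters block until the instance reaches each state, so the start request is sent only after the stop is complete.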
To automatically recover instances that fail system status checks, set up automatic instance recovery.
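One possible way to set up recovery is an Amazon CloudWatch alarm on the StatusCheckFailed_System metric with the EC2 recover action. The following sketch only builds the alarm parameters. The alarm name, period, and evaluation periods are illustrative assumptions, and `recovery_alarm_params` is a hypothetical helper name. Not every instance configuration supports the recover action, so verify support for your instance first.

```python
from typing import Any


def recovery_alarm_params(instance_id: str, region: str) -> dict[str, Any]:
    """Build put_metric_alarm arguments for an EC2 recover action."""
    return {
        "AlarmName": f"recover-{instance_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "StatusCheckFailed_System",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": 60,            # seconds, illustrative value
        "EvaluationPeriods": 2,  # illustrative value
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": [f"arn:aws:automate:{region}:ec2:recover"],
    }
```

To create the alarm, pass the result to a boto3 CloudWatch client, for example `cloudwatch.put_metric_alarm(**recovery_alarm_params(instance_id, region))`.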
Important: A force stop doesn't flush file system caches and can cause data loss or corruption. After a restart, run the following commands based on your operating system (OS) to check the file system for consistency errors.
Linux:
fsck
Windows:
chkdsk
sfc /scannow
The chkdsk command checks the file system and file system metadata of a volume for logical and physical errors. The sfc /scannow command scans protected system files for corruption and repairs the files that it can. For more information, see chkdsk and Using System File Checker on the Microsoft website.
Troubleshoot instance status check failures
Instance status checks identify issues with the OS, network configuration, and resource usage. To troubleshoot instance status check failures, use the following resources based on your instance OS:
Troubleshoot attached EBS status check failures
If your instance fails the attached EBS status check, then check the VolumeStalledIOCheck metric in Amazon CloudWatch for the affected volume. If the value is 1, then the volume has issues. Typically, Amazon EBS automatically diagnoses and recovers the volume within a few minutes. For more information, see Amazon EBS I/O characteristics and monitoring.
To validate that the volume recovered, check whether the VolumeStalledIOCheck metric changes to a value of 0.
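The metric check can also be scripted. The following minimal sketch assumes that `cloudwatch` is a boto3 CloudWatch client and that a 15-minute lookback with 60-second periods is enough for your case. Both values, and the helper names, are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Optional


def latest_stalled_io_value(datapoints: list[dict[str, Any]]) -> Optional[float]:
    """Return the most recent VolumeStalledIOCheck value, or None if no data."""
    if not datapoints:
        return None
    newest = max(datapoints, key=lambda point: point["Timestamp"])
    return newest["Maximum"]


def volume_recovered(
    cloudwatch: Any, volume_id: str, minutes: int = 15
) -> Optional[bool]:
    """True if the newest value is 0, False if it isn't, None if no datapoints."""
    end = datetime.now(timezone.utc)
    response = cloudwatch.get_metric_statistics(
        Namespace="AWS/EBS",
        MetricName="VolumeStalledIOCheck",
        Dimensions=[{"Name": "VolumeId", "Value": volume_id}],
        StartTime=end - timedelta(minutes=minutes),
        EndTime=end,
        Period=60,
        Statistics=["Maximum"],
    )
    value = latest_stalled_io_value(response.get("Datapoints", []))
    return None if value is None else value == 0
```

A result of `None` means that no datapoints were returned for the window, so the check is inconclusive rather than passed.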
If the issue persists, then stop and start the instance to migrate it to a new host. If a specific volume still has issues after the restart, then you must replace the volume. Create a snapshot of the volume, and then create a new volume from the snapshot. Use the new volume to replace the existing volume.
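The snapshot and new volume steps can be sketched as follows. This is a simplified illustration that assumes `ec2` is a boto3 EC2 client. It copies only the Availability Zone from the existing volume, so the volume type, IOPS, throughput, and tags are left at their defaults and might need to be set for your workload. `clone_volume_via_snapshot` is a hypothetical helper name.

```python
from typing import Any


def clone_volume_via_snapshot(ec2: Any, volume_id: str) -> str:
    """Snapshot a volume and create a new volume from it in the same AZ.

    `ec2` is assumed to be a boto3 EC2 client. Returns the new volume ID.
    Volume type and performance settings are NOT copied in this sketch.
    """
    volume = ec2.describe_volumes(VolumeIds=[volume_id])["Volumes"][0]

    snapshot = ec2.create_snapshot(
        VolumeId=volume_id,
        Description=f"Replacement source for {volume_id}",
    )
    snapshot_id = snapshot["SnapshotId"]
    # Wait until the snapshot is complete before creating a volume from it.
    ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snapshot_id])

    new_volume = ec2.create_volume(
        SnapshotId=snapshot_id,
        AvailabilityZone=volume["AvailabilityZone"],
    )
    new_volume_id = new_volume["VolumeId"]
    ec2.get_waiter("volume_available").wait(VolumeIds=[new_volume_id])
    return new_volume_id
```

To finish the replacement, detach the existing volume and attach the new volume to the instance with the same device name, for example with the client's `detach_volume` and `attach_volume` calls.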
Related information
Troubleshoot an unreachable Amazon EC2 instance