Why does my EC2 Linux instance go into emergency mode when I try to boot it?


When I boot my Amazon Elastic Compute Cloud (Amazon EC2) Linux instance, the instance goes into emergency mode and the boot process fails. Then, the instance becomes inaccessible.

Short description

An instance might boot in emergency mode for the following reasons:

  • There's a corrupted kernel on the instance that causes a Kernel panic error.
  • There are auto-mount failures because of incorrect entries in the /etc/fstab file that cause Dependency failed errors.

To identify the type of error, check the instance's console output.
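As a rough illustration of that check, the following shell sketch reads console output on standard input and reports which of the two error types it contains. The function name classify_console_output and the labels that it prints are made up for this example. The commented AWS CLI call assumes a configured CLI and uses a placeholder instance ID.

```shell
# classify_console_output: read EC2 console output on stdin and print
# which emergency-mode cause it matches. The name and the labels are
# illustrative only; they are not defined by AWS.
classify_console_output() {
    log=$(cat)
    case $log in
        *'Kernel panic'*)      echo "kernel-panic" ;;
        *'Dependency failed'*) echo "dependency-failed" ;;
        *)                     echo "unknown" ;;
    esac
}

# Example usage (needs AWS credentials; the instance ID is a placeholder):
#   aws ec2 get-console-output --instance-id i-0123456789abcdef0 \
#       --query Output --output text | classify_console_output
```

If both strings appear in the output, then this sketch reports the kernel panic first. That ordering is a choice made for the example, not a rule.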

Resolution

Kernel panic errors

If there's an issue with the kernel, then you receive an error message similar to the following example:

"Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(8,1)"

Kernel panic errors occur when the GRUB configuration or the initramfs file is corrupted. To troubleshoot this issue, complete the following steps:

  1. Revert the kernel to a previous, stable kernel.
  2. Reboot the instance.
  3. Correct the issues listed in the error message on the corrupted kernel.

Dependency failed errors

Dependency failed errors occur when syntax errors in the /etc/fstab file cause auto-mount failures. The error also occurs when the Amazon Elastic Block Store (Amazon EBS) volume listed in the file detaches from the instance. You receive an error message similar to the following example:

"[DEPEND] Dependency failed for /mnt.
[DEPEND] Dependency failed for Local File Systems.
[DEPEND] Dependency failed for Migrate local... structure to the new structure.
[DEPEND] Dependency failed for Relabel all filesystems, if necessary.
[DEPEND] Dependency failed for Mark the need to relabel after reboot.
[DEPEND] Dependency failed for File System Check on /dev/xvdf."

In the preceding example, the /mnt mount point failed to mount during the boot sequence. To make sure that the boot sequence doesn't enter emergency mode because of mount failures, add the following configurations to the /etc/fstab file:

  • A nofail option for the secondary partitions, such as /mnt.
    Note: The nofail option makes sure that the boot sequence isn't interrupted, even if a volume or partition fails to mount.
  • A 0 in the last column of the entry for the mount point. This value turns off the file system check for that entry.
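The two changes above can be sketched as a small awk filter. This is a minimal example, not an AWS tool: the function name harden_fstab is hypothetical, it assumes whitespace-separated fstab entries, and it deliberately leaves the root (/) and swap entries untouched. It prints the result instead of editing the file, so that you can review the output first.

```shell
# harden_fstab FILE: print a copy of FILE in which every entry other than
# root and swap has the nofail option and a 0 in the sixth (fsck) column.
# Comments, blank lines, and lines with fewer than four fields pass
# through unchanged. Rewritten entries use single tabs between fields.
harden_fstab() {
    awk '
        $1 ~ /^#/ || NF < 4       { print; next }  # comments, blanks, short lines
        $2 == "/" || $3 == "swap" { print; next }  # leave root and swap alone
        {
            opts = $4
            # Append nofail only when it is not already in the option list.
            if (index("," opts ",", ",nofail,") == 0) opts = opts ",nofail"
            dump = (NF >= 5) ? $5 : 0
            print $1 "\t" $2 "\t" $3 "\t" opts "\t" dump "\t" 0
        }
    ' "$1"
}

# Example usage (paths are placeholders):
#   harden_fstab /mnt/rescue/etc/fstab > /tmp/fstab.new
```

Writing to a separate file keeps the original intact until you have compared the two versions.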

To update the /etc/fstab file, use the EC2 Serial Console, run the AWSSupport-ExecuteEC2Rescue automation, or use a rescue instance to manually edit the file.

Important: Before you stop and start your instance, take the following actions:

Note: When you stop and start an instance, the instance's public IP address changes. It's a best practice to use an Elastic IP address to route external traffic to your instance instead of a public IP address. For more information, see Stop and start Amazon EC2 instances.

Use the EC2 Serial Console
Important: You don't need to stop and start the instance when you use EC2 Serial Console.

If you activated the EC2 Serial Console for Linux, then you can use it to troubleshoot supported Nitro-based instance types and supported bare metal instances. You don't need a working network connection to reach your instance through the EC2 Serial Console. Connect to the EC2 Serial Console, and then modify the /etc/fstab file.

If you haven't used the EC2 Serial Console before, then make sure that you adhere to the prerequisites. If your instance is unreachable and you haven't already configured access to the serial console, then you can't use EC2 Serial Console to correct the /etc/fstab file.

Run the AWSSupport-ExecuteEC2Rescue automation document

Prerequisites: Make sure that you have the required AWS Identity and Access Management (IAM) permissions to use AWSSupport-ExecuteEC2Rescue.

Run the AWSSupport-ExecuteEC2Rescue automation document to automatically correct boot issues. For more information, see Run the EC2Rescue tool on unreachable instances.

Use a rescue instance to manually edit the file

Complete the following steps:

  1. Open the Amazon EC2 console.

  2. Choose Instances, and then select the instance that's in emergency mode.

  3. Stop the instance.

  4. Detach the /dev/xvda or /dev/sda1 Amazon EBS root volume from the stopped instance.

  5. Launch a rescue instance in the same Availability Zone as the stopped instance.

  6. Attach the root volume to the rescue instance as a secondary device.
    Note: You can use different device names when you attach secondary volumes.

  7. Use SSH to connect to your rescue instance.

  8. To create a mount point directory for the new volume that you attached to the rescue instance, run the following command:

    sudo mkdir /mnt/rescue

    Note: Replace /mnt/rescue with your mount point directory.

  9. To identify the block device name and partition, run the following command:

    lsblk

    Example output:

    NAME    MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
    xvda    202:0    0    8G  0 disk
    └─xvda1 202:1    0    8G  0 part /
    xvdf    202:80   0  101G  0 disk
    └─xvdf1 202:81   0  101G  0 part

  10. To mount the volume on the mount point directory, run the following command:

    sudo mount -o nouuid /dev/xvdf1 /mnt/rescue

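    Note: Replace /dev/xvdf1 with your device name. The nouuid option applies only to XFS file systems. If the volume uses a different file system, such as ext4, then omit -o nouuid.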

  11. To open the /etc/fstab file, run the following command:

    sudo vi /mnt/rescue/etc/fstab

  12. Edit the entries in the file. The following example shows two Amazon EBS volumes defined with UUIDs. The secondary volume that's mounted at /mnt has the nofail option added, and each entry has a 0 as the last column:

    $ cat /mnt/rescue/etc/fstab
    UUID=e75a1891-3463-448b-8f59-5e3353af90ba  /  xfs  defaults,noatime  1  0
    UUID=ce917c0c-9e37-4ae9-bb21-f6e5022d5381  /mnt  ext4  defaults,noatime,nofail  1  0

  13. Save the file, and then run the following command to unmount the volume:

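    sudo umount /mnt/rescue

  14. Detach the volume from the rescue instance.

  15. Attach the volume to the original instance as the root volume. Use the same device name that the volume had before you detached it (/dev/xvda or /dev/sda1).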

  16. To confirm that you can boot the instance, start the instance.
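If you prefer the AWS CLI to the console for the stop, detach, attach, and start steps, then the sequence might look like the following sketch. The function only prints the commands so that you can review them before you run anything. The function name rescue_volume_plan, its argument order, and the /dev/sdf device name on the rescue instance are assumptions made for this example. All IDs are placeholders that you supply.

```shell
# rescue_volume_plan ORIGINAL_ID RESCUE_ID VOLUME_ID ROOT_DEVICE
# Prints (does not run) AWS CLI calls that mirror the console steps.
# /dev/sdf is an assumed secondary device name on the rescue instance.
rescue_volume_plan() {
    orig=$1
    rescue=$2
    vol=$3
    rootdev=$4
    printf '%s\n' \
        "aws ec2 stop-instances --instance-ids $orig" \
        "aws ec2 wait instance-stopped --instance-ids $orig" \
        "aws ec2 detach-volume --volume-id $vol" \
        "aws ec2 wait volume-available --volume-ids $vol" \
        "aws ec2 attach-volume --volume-id $vol --instance-id $rescue --device /dev/sdf" \
        "# On the rescue instance: mount the volume, edit etc/fstab, unmount" \
        "aws ec2 detach-volume --volume-id $vol" \
        "aws ec2 wait volume-available --volume-ids $vol" \
        "aws ec2 attach-volume --volume-id $vol --instance-id $orig --device $rootdev" \
        "aws ec2 start-instances --instance-ids $orig"
}

# Example usage (all IDs are placeholders):
#   rescue_volume_plan i-0aaaaaaaaaaaaaaaa i-0bbbbbbbbbbbbbbbb vol-0cccccccccccccccc /dev/xvda
```

The wait commands are included because a volume can be attached elsewhere only after the detach completes.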