
Why is my target instance launched by MGN or DRS failing with a 'dracut-initqueue timeout' error?

7 minute read
Content level: Expert

I am experiencing boot failures with my target instance launched by AWS Application Migration Service (MGN) or AWS Elastic Disaster Recovery (AWS DRS). The error message shows "Warning: dracut-initqueue timeout - starting timeout scripts".

Short description

The "dracut-initqueue timeout" error occurs when the initial RAM disk (initramfs) can't detect and mount the root filesystem during boot. This typically happens when the AWS-generated initramfs files lack necessary drivers for the target EC2 instance type, when the target instance type doesn't support the kernel version of the source server, or when the bootloader configuration contains incorrect partition references.

Cause

The AWS Replication Agent generates special initramfs files (prefixed with "aws-launch-") that are used during target instance boot. If these files don't contain the correct drivers for your target instance type, if your kernel version is incompatible with the selected instance type, or if your bootloader configuration references partitions by device names instead of persistent identifiers, the boot process fails with this error.

Resolution

Step 1: Verify and inject required drivers on the source server

Before launching any target instances, ensure the AWS-generated initramfs contains all necessary drivers.

Check if AWS initramfs was created successfully

Verify the AWS Replication Agent created the initramfs file and it has a valid size:

# For Red Hat and similar distributions:
sudo ls -la /boot/aws-launch-initramfs-$(uname -r).img

# For SUSE:
sudo ls -la /boot/aws-launch-initrd-$(uname -r)

# For Debian/Ubuntu:
sudo ls -la /boot/aws-launch-initrd.img-$(uname -r)

If the file is missing or has zero size, check available space in /boot:

sudo df -h /boot

Action required: If /boot is full, free up space by removing old kernels or unnecessary files.
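
If you need to free space, the exact commands depend on the distribution's package manager. The following is a minimal sketch; the Red Hat kernel package name is only an example, and you must never remove the currently running kernel.

# Red Hat and similar distributions: list installed kernels and the running kernel, then remove an old one
rpm -q kernel
uname -r
sudo dnf remove kernel-4.18.0-305.el8.x86_64   # example old kernel package only

# Debian/Ubuntu: list installed kernel images, then purge old unused kernels
dpkg --list 'linux-image*'
sudo apt-get autoremove --purge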

Verify driver presence in AWS initramfs

Check which drivers are currently included:

# For Red Hat and similar distributions:
sudo lsinitrd /boot/aws-launch-initramfs-$(uname -r).img 2>/dev/null | grep -E "xen-blkfront|xen-netfront|ena|nvme|nvme_core"

# For SUSE:
sudo lsinitrd /boot/aws-launch-initrd-$(uname -r) 2>/dev/null | grep -E "xen-blkfront|xen-netfront|ena|nvme|nvme_core"

# For Debian/Ubuntu:
sudo lsinitramfs /boot/aws-launch-initrd.img-$(uname -r) 2>/dev/null | grep -E "xen-blkfront|xen-netfront|ena|nvme|nvme_core"

Required drivers:

  • Nitro instances (e.g. m5, c5, t3): Must have ena, nvme, and nvme_core
  • Xen instances (e.g. t2, m4, c4): Must have xen-blkfront and xen-netfront
  • SUSE 12 SP1: Needs xen_vnif and xen_vbd from xen-kmp-default package
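
To check all of the required modules in one pass, you can loop over them and report any that are missing from the AWS-generated initramfs. This is a minimal sketch for Red Hat-style paths; adjust the INITRAMFS path and use lsinitramfs on Debian/Ubuntu. Module file names sometimes use "-" where modinfo uses "_", so the pattern matches either.

#!/bin/bash
# Report which required drivers are missing from the AWS-generated initramfs
INITRAMFS="/boot/aws-launch-initramfs-$(uname -r).img"

for drv in xen-blkfront xen-netfront ena nvme nvme_core; do
    if sudo lsinitrd "$INITRAMFS" 2>/dev/null | grep -q "${drv//_/[-_]}"; then
        echo "present: $drv"
    else
        echo "MISSING: $drv"
    fi
done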

Manually add missing drivers

If any required drivers are missing, regenerate the AWS initramfs with all drivers:

Important: Only modify AWS-generated files (with "aws-launch-" prefix).

# For Red Hat and similar distributions:
sudo dracut -v --force --add "lvm" --add-drivers "xen-blkfront xen-netfront ext4 xfs ena nvme nvme_core" /boot/aws-launch-initramfs-$(uname -r).img $(uname -r)

# For SUSE:
sudo dracut -v --force --add "lvm" --add-drivers "xen-blkfront xen-netfront ext4 xfs ena nvme nvme_core" /boot/aws-launch-initrd-$(uname -r) $(uname -r)

# For Debian/Ubuntu:
sudo dracut -v --force --add "lvm" --add-drivers "xen-blkfront xen-netfront ext4 xfs ena nvme nvme_core" /boot/aws-launch-initrd.img-$(uname -r) $(uname -r)

Confirm driver injection success

Verify the drivers were successfully added:

# Replace with appropriate initramfs path for your distribution
sudo lsinitrd /boot/aws-launch-initramfs-$(uname -r).img | grep -E "xen|ena|nvme"

Wait for replication: After making changes, wait 10-15 minutes for the updated initramfs to replicate to the staging area before launching a new instance.

Step 2: Select the appropriate instance family

If driver injection doesn't resolve the issue, the problem might be kernel-instance type incompatibility.

Quick resolution through instance type switching

Try launching with a different instance family:

  • Currently using Nitro (e.g. m5, c5, t3)? → Switch to Xen (e.g. t2.medium, m4, c4)
  • Currently using Xen (e.g. t2, m4, c4)? → Switch to Nitro (e.g. t3.medium, m5, c5)
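
If you're not sure whether a given instance type is Nitro- or Xen-based, you can query its hypervisor with the AWS CLI (replace the instance types as needed):

# Returns "nitro" for Nitro-based types and "xen" for Xen-based types
aws ec2 describe-instance-types \
    --instance-types t3.medium m4.large \
    --query "InstanceTypes[].[InstanceType, Hypervisor]" \
    --output table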

Instance selection guidelines

Choose your instance type based on these criteria:

  1. Kernel version check:
uname -r

For more details, see the Linux kernel and AMI requirements for Nitro-based instances in the Amazon EC2 documentation.

  2. Boot mode check:
[ -d /sys/firmware/efi ] && echo "UEFI Boot Detected" || echo "Legacy BIOS Boot Detected"
  • UEFI systems: Must use Nitro instances
  • Legacy BIOS: Can use either type
  3. NVMe and ENA driver availability:
  • Check if the NVMe and ENA drivers are present in the kernel
modinfo nvme
modinfo nvme_core
modinfo ena
  • NVMe and ENA drivers present: Can use Nitro instances
  • NVMe and ENA drivers NOT present: Must use Xen instances
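
The three checks above can be combined into a single script run on the source server. This is a minimal sketch that only prints hints and doesn't cover every kernel/instance-type combination, so treat its output as a starting point.

#!/bin/bash
# Quick source-server checks to help choose between Nitro and Xen instance families

echo "Kernel version: $(uname -r)"

# Boot mode: UEFI systems must use Nitro-based instance types
if [ -d /sys/firmware/efi ]; then
    echo "Boot mode: UEFI (use a Nitro-based instance type)"
else
    echo "Boot mode: Legacy BIOS (either instance family may work)"
fi

# NVMe and ENA driver availability
missing=0
for mod in nvme nvme_core ena; do
    modinfo "$mod" >/dev/null 2>&1 || { echo "Kernel module not available: $mod"; missing=1; }
done

if [ "$missing" -eq 0 ]; then
    echo "NVMe and ENA drivers found: Nitro-based instance types are an option"
else
    echo "NVMe and/or ENA drivers missing: use a Xen-based instance type (for example t2, m4, c4)"
fi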

Step 3: Validate bootloader configuration

The bootloader must use persistent identifiers (UUID or LABEL) instead of device names, because device names can change between the source environment and EC2 (for example, /dev/sda1 on the source server can appear as /dev/xvda1 or an NVMe device on the target instance).

Identify partition reference issues

Check current GRUB configuration:

# View current kernel parameters
sudo cat /etc/default/grub | grep GRUB_CMDLINE_LINUX

# Check for device name references in grub.cfg
sudo grep -E "root=/dev/[sx]d|root=/dev/hd|resume=/dev/" /boot/grub2/grub.cfg

Problem indicators:

  • root=/dev/sda1 or root=/dev/xvda1 (uses device names)
  • resume=/dev/sda2 (swap using device name)
  • Any /dev/sd* or /dev/hd* references

Convert to persistent identifiers

  1. Collect UUID information:
# Get all partition UUIDs
sudo blkid

# Get root partition UUID specifically
sudo findmnt -n -o UUID /

# For swap partition
sudo swapon --show=NAME,UUID
  2. Update GRUB defaults:
# Backup original configuration
sudo cp /etc/default/grub /etc/default/grub.backup

# Edit the configuration
sudo vi /etc/default/grub

Replace device names with UUIDs:

# Change from:
GRUB_CMDLINE_LINUX="root=/dev/sda1 resume=/dev/sda2"

# To:
GRUB_CMDLINE_LINUX="root=UUID=abc12345-def6-7890-abcd-ef1234567890 resume=UUID=fed09876-cba5-4321-0987-654321fedcba"

Note: The name and location of the GRUB configuration file depend on the OS version and the GRUB version.

Wait for replication: After making changes, wait 10-15 minutes for the updated grub configuration file to replicate to the staging area before launching a new instance.

Step 4: Address LVM-specific configurations

For systems using Logical Volume Manager (LVM), additional configuration validation is required.

Validate LVM setup

# Check volume groups and logical volumes
sudo vgs
sudo lvs

# Verify LVM paths
sudo lvdisplay | grep "LV Path"

# Check GRUB for LVM parameters
sudo grep -E "rd.lvm|lvm" /etc/default/grub

# Check whether the LVM configuration defines device filters that could reject EC2 device names (such as /dev/xvd* or /dev/nvme*)
sudo grep -E '^\s*(global_)?filter' /etc/lvm/lvm.conf

Fix common LVM issues

  1. Correct LVM parameter typos:
    • Ensure VG/LV names match actual configuration. For example, change "rd.lvm.lv=rhel/rooti" to "rd.lvm.lv=rhel/root" if necessary.
  2. Add LVM support to initramfs:
# Ensure LVM driver is included when regenerating
sudo dracut -v --force --add "lvm" --add-drivers "xen-blkfront xen-netfront ena nvme nvme_core" /boot/aws-launch-initramfs-$(uname -r).img $(uname -r)

Step 5: Perform offline recovery

If the instance still fails after the above steps, perform offline repair using a rescue instance.

Mount and repair the failed instance's root volume

  1. Prepare rescue environment:
  • Launch a rescue instance in the same Availability Zone
  • Stop the failed target instance
  • Detach the root volume from the failed instance
  • Attach it to the rescue instance as a secondary volume (example AWS CLI commands follow step 3)
  2. Mount the volume:

For standard partitions:

sudo mkdir /mnt/rescue
sudo mount /dev/xvdf1 /mnt/rescue  # Adjust device name

For LVM:

sudo vgchange -ay
sudo mount /dev/mapper/vg_name-lv_root /mnt/rescue
  3. Prepare chroot environment:
# Mount system directories
for m in dev proc run sys; do sudo mount -o bind {,/mnt/rescue}/$m; done

# Mount /boot if separate
sudo mount /dev/xvdf2 /mnt/rescue/boot  # If applicable

# Enter chroot
sudo chroot /mnt/rescue
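
For reference, the stop/detach/attach operations from step 1 can also be performed with the AWS CLI. This is a sketch with placeholder instance and volume IDs; the device name you pass to attach-volume may appear differently inside the rescue instance (for example as /dev/xvdf or an NVMe device).

# Replace the placeholder IDs with your failed instance, root volume, and rescue instance IDs
aws ec2 stop-instances --instance-ids i-0123456789abcdef0
aws ec2 wait instance-stopped --instance-ids i-0123456789abcdef0

aws ec2 detach-volume --volume-id vol-0123456789abcdef0
aws ec2 wait volume-available --volume-ids vol-0123456789abcdef0

aws ec2 attach-volume --volume-id vol-0123456789abcdef0 \
    --instance-id i-0fedcba9876543210 --device /dev/sdf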

Apply comprehensive fixes in chroot

  1. Update bootloader to use UUIDs:
# Get UUIDs
blkid

# Update /etc/default/grub
vi /etc/default/grub

# Regenerate GRUB
grub2-mkconfig -o /boot/grub2/grub.cfg
  2. Rebuild the main system initramfs:
# For Red Hat and similar distributions:
dracut -v --force --add "lvm" --add-drivers "xen-blkfront xen-netfront ena nvme nvme_core" /boot/initramfs-$(uname -r).img $(uname -r)

# For SUSE:
dracut -v --force --add "lvm" --add-drivers "xen-blkfront xen-netfront ena nvme nvme_core" /boot/initrd-$(uname -r) $(uname -r)

# For Debian/Ubuntu:
dracut -v --force --add "lvm" --add-drivers "xen-blkfront xen-netfront ena nvme nvme_core" /boot/initrd.img-$(uname -r) $(uname -r)
  3. Clean up and detach:
  • Exit chroot
exit
  • Unmount everything
sudo umount -R /mnt/rescue
  • Detach volume and reattach to original instance

Step 6: Final validation and testing

After applying fixes:

  1. For source server fixes: Wait 10-15 minutes for the changes to replicate to the staging area
  2. Launch a test instance with the recommended instance type
  3. Monitor the console output and instance screenshot for boot progress
  4. If the instance boots successfully, proceed with cutover or recovery
  5. If it still fails, collect the serial console logs and diagnostic output from the source server, then contact AWS Support
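
The console output and instance screenshot mentioned above can be retrieved with the AWS CLI, for example (placeholder instance ID; depending on the CLI version, the console output may still be base64-encoded and need decoding):

# Buffered console output from the failed target instance
aws ec2 get-console-output --instance-id i-0123456789abcdef0 \
    --query Output --output text

# Latest console screenshot, saved as a JPG file
aws ec2 get-console-screenshot --instance-id i-0123456789abcdef0 \
    --query ImageData --output text | base64 --decode > console-screenshot.jpg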

Related information