Why is my target instance not launching with AWS MGN or AWS DRS?

I am attempting to launch a target instance using AWS MGN or AWS DRS, but the instance is not launching correctly.

Short description

Several issues can prevent a target instance from launching correctly with AWS Application Migration Service (AWS MGN) or AWS Elastic Disaster Recovery (AWS DRS). Common causes include incorrect or incompatible network configurations in the launch template, and insufficient permissions for the service to perform the necessary actions.

Resolution

Review the following common causes and troubleshooting steps to resolve issues with launching a target instance using Application Migration Service or Elastic Disaster Recovery.

VPC-related errors

One of the most common issues occurs when the network configurations in your launch template are incorrect or incompatible. To resolve these issues:

  1. Verify that the VPC, subnet, and security group specified in the launch template are correct and exist in the target AWS account and Region.
  2. Check that the specified subnet's IP address range is compatible with the private IP address assigned to the instance. If the "Copy private IP" setting is enabled in the launch settings and the source server's IP address is not within the target subnet's range, the launch fails. In such cases, disable this setting.
  3. If any changes are made to the launch template, create a new version and set it as the default version before attempting to launch the instance again.

Here are some examples of VPC-related errors you may encounter in CloudTrail logs and how to resolve them:

IP Address Already in Use

Error: "An error occurred (InvalidIPAddress.InUse) when calling the RunInstances operation: Address 10.213.10.101 is in use."

This error occurs when the IP address you're trying to use is already allocated. To resolve:

  • Check whether an existing instance or elastic network interface (ENI) already uses this IP address by running the following AWS CLI command:
aws ec2 describe-network-interfaces \
--filters Name=private-ip-address,Values=<IP-ADDRESS> \
--query 'NetworkInterfaces[0].NetworkInterfaceId' \
--output text
  • Either remove the static IP from the launch template or ensure the IP address is freed before launching the target instance.

IP Address Outside Subnet Range

Error: "An error occurred (InvalidParameterValue) when calling the RunInstances operation: Address 10.10.2.4 does not fall within the subnet's address range."

This typically occurs when:

  • "Copy private IP" is enabled in the launch settings.
  • The launch template contains a static IP outside the subnet range.

Solution: Disable "Copy private IP" or remove the static IP configuration from the launch template, whichever applies.
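Before launching, you can check offline whether a given private IP address can live in the target subnet. The following is a minimal sketch that uses only the Python standard library. The function name `usable_in_subnet` is hypothetical, and the check assumes the standard AWS rule that the first four addresses and the last address of every subnet are reserved.

```python
import ipaddress


def usable_in_subnet(private_ip: str, subnet_cidr: str) -> bool:
    """Return True if private_ip could be assigned inside subnet_cidr."""
    ip = ipaddress.ip_address(private_ip)
    subnet = ipaddress.ip_network(subnet_cidr, strict=True)
    if ip not in subnet:
        return False
    # AWS reserves the first four addresses and the last address of a subnet.
    offset = int(ip) - int(subnet.network_address)
    return 4 <= offset < subnet.num_addresses - 1


if __name__ == "__main__":
    # The address from the error above, tested against an example subnet.
    print(usable_in_subnet("10.10.2.4", "10.213.10.0/24"))
```

If the function returns False for the source server's address and the target subnet's CIDR, the launch is likely to fail with the error shown above.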

Missing VPC Configuration

Error: "An error occurred (VPCIdNotSpecified) when calling the RunInstances operation: No default VPC for this user. GroupName is only supported for EC2-Classic and default VPC."

This error appears when:

  • The default VPC has been removed.
  • A default subnet/VPC is not selected in the EC2 launch template.
  • An incorrect target subnet is specified in the EC2 launch template.
  • The EC2 launch template with the correct subnet settings is not set as the default.

Solution: Configure a specific VPC and subnet in the launch template, and set that template version as the default.

Invalid Security Group

Error: "An error occurred (InvalidGroup.NotFound) when calling the RunInstances operation: The security group '<SG-ID>' does not exist in VPC '<VPC-ID>'."

Solution: Verify that the security group exists and belongs to the correct VPC.
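When you review many failed RunInstances events, it can help to map each error code to the remedy described in this section. The following sketch does that with a plain dictionary. The names `REMEDIES` and `suggest_remedy` are hypothetical, and the handling of a `Client.` or `Server.` prefix on the error code is an assumption about how the code may appear in CloudTrail records.

```python
# Remedies taken from the VPC-related errors described in this article.
REMEDIES = {
    "InvalidIPAddress.InUse":
        "Free the IP address or remove the static IP from the launch template.",
    "InvalidParameterValue":
        "If the message mentions the subnet's address range, remove the static "
        "IP or stop copying the source server's private IP.",
    "VPCIdNotSpecified":
        "Configure a specific VPC and subnet in the launch template.",
    "InvalidGroup.NotFound":
        "Verify that the security group exists and belongs to the correct VPC.",
}

DEFAULT_REMEDY = "No VPC-related remedy listed; read the full error message."


def suggest_remedy(error_code: str) -> str:
    """Map a RunInstances error code to the remedy described in this article."""
    # Assumption: the code may carry a "Client." or "Server." prefix.
    for prefix in ("Client.", "Server."):
        if error_code.startswith(prefix):
            error_code = error_code[len(prefix):]
            break
    return REMEDIES.get(error_code, DEFAULT_REMEDY)


if __name__ == "__main__":
    print(suggest_remedy("Client.VPCIdNotSpecified"))
```

InvalidParameterValue is a generic code, so always confirm the remedy against the full error message.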

Permission and Authorization Errors

Permission-related issues are another frequent cause of launch failures. Here's how to address them:

  1. Review the CloudTrail logs for any "UnauthorizedOperation" or "AccessDenied" errors related to the following actions, or to other required actions:

    • ec2:RunInstances
    • ec2:CreateSnapshot
    • ec2:CreateVolume
    • ec2:StartInstances
    • ec2:TerminateInstances
    • ec2:StopInstances
    • ec2:AttachVolume
    • ec2:DetachVolume
    • ec2:RegisterImage
    • ec2:DeregisterImage
    • ec2:DescribeImages
    • ec2:CreateTags
  2. If you encounter an authorization error, decode it using the AWS CLI command:

aws sts decode-authorization-message \
--encoded-message [ENCODED_MESSAGE] \
--query DecodedMessage \
--output text | jq '.'
  3. If the error message mentions an explicit deny in a service control policy (SCP), identify the SCP that causes the issue and work with your organization's administrators to update it to allow the necessary actions. The following is an example of the error that you might get:

Error: "You are not authorized to perform this operation. User: arn:aws:sts::ACCOUNT_ID:assumed-role/ROLE_NAME/user@domain.com is not authorized to perform: ec2:RunInstances on resource: arn:aws:ec2:REGION:ACCOUNT:instance/* with an explicit deny in a service control policy"

  4. Verify that the IAM role or user that starts the launch of the target instance has the required permissions, including:

    • ec2:RunInstances
    • ec2:StartInstances
    • ec2:StopInstances
    • ec2:TerminateInstances
    • ec2:CreateVolume
    • ec2:AttachVolume
    • ec2:DetachVolume
    • ec2:DeleteVolume
    • ec2:CreateSnapshot
  5. If you use a customer managed AWS KMS key for EBS encryption, ensure that the key policy grants the required permissions to the IAM user or role that starts the launch of the target instance, such as kms:CreateGrant, kms:DescribeKey, kms:Encrypt, kms:Decrypt, kms:GenerateDataKey, and kms:GenerateDataKeyWithoutPlaintext.

The following is an example of the error:

Error: User: arn:aws:sts::ACCOUNT_ID:assumed-role/ROLE_NAME/user@domain.com is not authorized to perform: kms:CreateGrant on resource: arn:aws:kms:REGION:ACCOUNT_ID:key/<KEY_ID> because no resource-based policy allows the kms:CreateGrant action
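To narrow down a KMS error like this one, you can compare the key policy document against the KMS actions listed above. The following is a rough sketch, not a policy evaluator. The function name `missing_kms_actions` is hypothetical, and the check deliberately ignores Deny statements, conditions, NotAction, and permissions delegated to IAM policies through the account root principal. Pass the principal ARN exactly as it appears in the key policy (typically the IAM role ARN, not the assumed-role session ARN).

```python
import json
from fnmatch import fnmatchcase

# KMS actions named in this article for launching with an encrypted volume.
REQUIRED_KMS_ACTIONS = {
    "kms:CreateGrant", "kms:DescribeKey", "kms:Encrypt", "kms:Decrypt",
    "kms:GenerateDataKey", "kms:GenerateDataKeyWithoutPlaintext",
}


def _as_list(value):
    """Policy fields may hold a single value or a list; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def missing_kms_actions(key_policy_json: str, principal_arn: str) -> set:
    """Return required actions that no Allow statement grants to the principal."""
    policy = json.loads(key_policy_json)
    granted = set()
    for statement in _as_list(policy.get("Statement")):
        if statement.get("Effect") != "Allow":
            continue
        principal = statement.get("Principal", {})
        arns = ["*"] if principal == "*" else _as_list(principal.get("AWS"))
        if principal_arn in arns or "*" in arns:
            granted.update(_as_list(statement.get("Action")))
    # Action patterns such as "kms:*" or "kms:Generate*" are matched as globs.
    return {
        action for action in REQUIRED_KMS_ACTIONS
        if not any(fnmatchcase(action, pattern) for pattern in granted)
    }
```

An empty result only means that the Allow statements cover the listed actions for that principal. The decoded authorization message and CloudTrail remain the authoritative source for why a call was denied.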

Issues with the source server snapshot or replication process

Application Migration Service or Elastic Disaster Recovery may sometimes fail to take snapshots of EBS volumes corresponding to source server disks. Follow these troubleshooting steps:

  1. When you encounter snapshot update errors, check the CloudTrail logs for CreateSnapshot or DescribeSnapshots API call failures.

  2. For SNAPSHOTS_FAILURE errors, check:

    • IAM permissions - Verify required permissions are attached to appropriate roles
    • API throttling - Check if you have activated throttling. If throttling is not activated, check your CloudTrail logs for throttling errors.
  3. Verify connectivity between source and replication servers:

    • Ensure that the AWS Replication Agent is running on the source server, can establish SSL connections with the regional MGN/DRS API endpoint, and is connected to the replication server.
    • If needed, use VPC Traffic Mirroring for packet capture analysis to identify any entities that block replication.
  4. Address replication lag or backlog issues.

  5. Investigate stuck snapshots using the troubleshooting guide.

  6. Verify endpoint accessibility:

    • Ensure EC2, MGN (or DRS), and S3 endpoints are reachable from the staging area on port 443
    • If you use VPC endpoints to reach these services, make sure that the security groups of the endpoints allow inbound connections on port 443.
    • Confirm no SSL interception is present
    • Reference Application Migration Service and Elastic Disaster Recovery documentation for detailed requirements
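The port 443 check in the last step can be sketched as a small script that you run from an instance in the staging area subnet. This is a minimal sketch that uses only the Python standard library. The function names `endpoint_hosts` and `can_connect` are hypothetical, the host names assume the standard regional public endpoints (`ec2`, `mgn` or `drs`, and `s3`), and the check confirms TCP reachability only. It does not detect SSL interception.

```python
import socket


def endpoint_hosts(region: str, service: str = "mgn") -> list:
    """Build the regional endpoint host names; pass service="drs" for AWS DRS."""
    return [
        f"ec2.{region}.amazonaws.com",
        f"{service}.{region}.amazonaws.com",
        f"s3.{region}.amazonaws.com",
    ]


def can_connect(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


if __name__ == "__main__":
    # "us-east-1" is only an example Region.
    for host in endpoint_hosts("us-east-1", "mgn"):
        status = "reachable" if can_connect(host) else "NOT reachable"
        print(f"{host}:443 {status}")
```

If you use interface VPC endpoints with private DNS, the same host names resolve to the endpoint's private addresses, so a failure here often points to the endpoint's security group.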

For comprehensive troubleshooting guidance, consult the Application Migration Service and Elastic Disaster Recovery troubleshooting documentation.

Incompatible instance or volume types

Certain instance families or volume types might not be supported in specific AWS Regions or Availability Zones. For example, some Local Zones support only gp2 SSD volumes and not gp3.

  1. Review the instance type and volume type specified in the launch template.
  2. Check the AWS documentation for any compatibility restrictions or limitations in the target region or Availability Zone.
  3. Update the launch template with a compatible instance type and volume type supported in the target environment.

Resource limits

If you have reached the limit for the number of instances, volumes, or other resources that the target instance needs in the target AWS account or Region, new instance launches fail. Typical causes include:

  • Service quotas exceeded (for example, security groups or IP addresses)
  • Insufficient instance capacity in the target Availability Zone
  • More security groups attached than the quota allows

To resolve these issues:

  1. Check the relevant service quotas and limits in the target account and region using the Service Quotas console.
  2. If you have reached a limit, request a quota increase through the Service Quotas console or by contacting AWS Support.

Other Common Issues

  • API throttling
  • The conversion server can't reach the AWS endpoints that it needs from the staging area.
  • The AMI referenced in the launch template has been deregistered, or is private and doesn't belong to the account where the instance launch occurs.

For all of these issues:

  1. Check CloudTrail logs for detailed error messages
  2. Verify resource availability and quotas
  3. Review network and security configurations
  4. Ensure proper permissions and policies are in place
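The first step, checking CloudTrail for detailed error messages, can be sketched as a filter over exported CloudTrail records. The function names `failed_calls` and `looks_throttled` are hypothetical. The field names `eventName`, `errorCode`, and `errorMessage` follow the CloudTrail record format, and the throttling markers are an assumption based on common error codes such as RequestLimitExceeded and ThrottlingException.

```python
def failed_calls(records, event_names=None):
    """Return (eventName, errorCode, errorMessage) for records with an error.

    records: list of CloudTrail record dicts (the "Records" array of a log file).
    event_names: optional set of API names to keep, e.g. {"RunInstances"}.
    """
    failures = []
    for record in records:
        code = record.get("errorCode")
        if not code:
            continue  # successful calls carry no errorCode
        name = record.get("eventName")
        if event_names and name not in event_names:
            continue
        failures.append((name, code, record.get("errorMessage", "")))
    return failures


def looks_throttled(error_code: str) -> bool:
    """Heuristic: True if the error code suggests API throttling."""
    return "RequestLimitExceeded" in error_code or "Throttl" in error_code


if __name__ == "__main__":
    # Illustrative records, not real log data.
    sample = [
        {"eventName": "RunInstances", "errorCode": "Client.UnauthorizedOperation",
         "errorMessage": "You are not authorized to perform this operation."},
        {"eventName": "DescribeSnapshots"},
    ]
    for name, code, message in failed_calls(sample, {"RunInstances", "CreateSnapshot"}):
        print(name, code, message)
```

Filtering by event name keeps the output focused on the launch-related actions listed earlier in this article.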

Related information

AWS
SUPPORT ENGINEER
published 8 months ago, 481 views