Failed to start the instance id-***** Insufficient capacity

0

Sometimes when I try to start a g5 (On-Demand) instance, I get this message: "Failed to start the instance * Insufficient capacity". My g5 (On-Demand) quota is 128, and I get the message even when none of my instances are running.

ccoelho
asked 7 months ago, 261 views
3 Answers
2

Hi,

This means that AWS doesn't currently have enough available On-Demand capacity to complete your request. To resolve the issue, try the following:

  • Wait a few minutes and then submit your request again.
  • Submit a new request with a reduced number of instances.
  • If you're launching an instance, submit a new request without specifying an Availability Zone.
  • If you're launching an instance, submit a new request using a different instance type. You can resize at a later stage, if needed.
  • Launching instances into a cluster placement group might cause an insufficient capacity error. For more information, see Working with placement groups.

More information here.
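
The "wait and submit again" suggestion can be automated. Below is a minimal sketch, assuming a boto3 EC2 client is passed in; the function name `start_with_retry`, the attempt count, and the delay values are illustrative choices, not AWS recommendations. It retries only when the error code is `InsufficientInstanceCapacity` and re-raises anything else.

```python
import time


def start_with_retry(ec2_client, instance_id, max_attempts=5,
                     base_delay=30.0, sleep=time.sleep):
    """Start an instance, retrying only on insufficient-capacity errors.

    Returns the number of attempts used. Any other error is re-raised
    immediately; the capacity error is re-raised after max_attempts.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            ec2_client.start_instances(InstanceIds=[instance_id])
            return attempt
        except Exception as err:
            # botocore's ClientError exposes the error code via err.response.
            code = getattr(err, "response", {}).get("Error", {}).get("Code")
            if code != "InsufficientInstanceCapacity" or attempt == max_attempts:
                raise
            # Exponential backoff: base_delay, 2x, 4x, ... (arbitrary values).
            sleep(base_delay * 2 ** (attempt - 1))
```

With boto3 installed, this could be called as `start_with_retry(boto3.client("ec2"), "i-0123456789abcdef0")`, where the instance ID is a placeholder. Retrying does not create capacity; it only saves resubmitting by hand while capacity frees up.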

EXPERT
answered 7 months ago
0

The g5 instance type is a GPU instance, and launching one counts against a vCPU quota.

AWS provides a default quota of 0 vCPUs for all G instances. This means that you can't create any G instances until the quota is raised.

According to the official documentation, you need at least 4 vCPUs to start the instance. I suggest you check your vCPU quota first.

Check service quota

It is useful to find out the current quota before requesting an increase. To do that, follow these steps:

  1. Log in to your AWS console.
  2. Select the region you want to use. For example, us-east-1 (N. Virginia).
  3. Search for Service Quotas.
  4. Click AWS Services in the left sidebar.
  5. Search for Amazon Elastic Compute Cloud (Amazon EC2).
  6. Find the entry for Running On-Demand G and VT instances.

The entry for this quota shows that the default value is 0 vCPUs.
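
The console lookup can also be scripted. This is a rough sketch, assuming a boto3 `service-quotas` client is passed in; `find_quota` is a hypothetical helper name. It pages through the EC2 quotas and matches on the quota name used in the steps above.

```python
def find_quota(quotas_client, quota_name, service_code="ec2"):
    """Return the value of the named quota, or None if it is not listed."""
    paginator = quotas_client.get_paginator("list_service_quotas")
    for page in paginator.paginate(ServiceCode=service_code):
        for quota in page["Quotas"]:
            if quota["QuotaName"] == quota_name:
                return quota["Value"]
    return None
```

For example, with boto3 installed: `find_quota(boto3.client("service-quotas", region_name="us-east-1"), "Running On-Demand G and VT instances")`. Quotas are per Region, so the client's Region should match the one where the instances run.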

To request a quota increase, select the entry for Running On-Demand G and VT instances, and click Request quota increase.

Enter the number of vCPUs you want to use. For example, if you want two of the smallest GPU instances (4 vCPUs each), you need to request 8 vCPUs.
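
The quota is counted in vCPUs, not instances, so the amount to request depends on the instance sizes. A small sketch of the arithmetic follows; the vCPU figures are the published g5 sizes as far as I know and should be checked against the current EC2 instance type documentation, and `vcpus_needed` is a hypothetical helper.

```python
# vCPUs per g5 size (assumed from the published specs; verify before relying on it).
G5_VCPUS = {
    "g5.xlarge": 4, "g5.2xlarge": 8, "g5.4xlarge": 16, "g5.8xlarge": 32,
    "g5.12xlarge": 48, "g5.16xlarge": 64, "g5.24xlarge": 96, "g5.48xlarge": 192,
}


def vcpus_needed(fleet):
    """Total vCPUs for a fleet given as {instance_type: instance_count}."""
    return sum(G5_VCPUS[itype] * count for itype, count in fleet.items())


# Two of the smallest g5 instances need 2 * 4 = 8 vCPUs of quota.
print(vcpus_needed({"g5.xlarge": 2}))
```

Comparing this total with the quota shows whether a launch is blocked by the quota or by something else, such as capacity.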

Click the Request button. AWS will review the request and increase the quota if there are no issues. The steps for increasing quotas on other cloud services are similar.

answered 7 months ago
  • Sorry if I was not clear. We already have a quota (128), and we are already using some g5 instances to train LLM models. But sometimes, even with no machines in the "start state", this error still appears.

  • My fault, I didn't read the description carefully. Have you considered that this is caused by an insufficient number of machines in the AWS Region or Availability Zone? I think Reserved Instances may help with your problem.

  • Sorry, but when you say "reserved", do you mean "dedicated instances"?

  • No. You said that you are already using some g5 instances to train LLM models. I don't know how large your needs are, and I'm just giving my thoughts, but if you use them a lot, I think reserving instances could solve your problem while costing less.

0

You can also take a look at On-Demand Capacity Reservations. By creating Capacity Reservations, you ensure that you always have access to EC2 capacity when you need it, for as long as you need it. You can create Capacity Reservations at any time, without entering into a one-year or three-year term commitment, and the capacity is available immediately. When you no longer need it, cancel the Capacity Reservation to stop incurring charges.

Please note that this is not a workaround if you are already hitting an Insufficient Capacity Error. Capacity Reservations are most useful when you design your architecture up front, so that you proactively reserve capacity and do not hit an Insufficient Capacity Error later.

For more information, see: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-capacity-reservations.html
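
As an illustration, a Capacity Reservation can be created through the EC2 API. This is a minimal sketch, assuming a boto3 EC2 client is passed in; `reserve_capacity` is a hypothetical helper, and the platform and open-ended end date are example choices only.

```python
def reserve_capacity(ec2_client, instance_type, availability_zone, count,
                     platform="Linux/UNIX"):
    """Create an open-ended On-Demand Capacity Reservation and return its ID."""
    response = ec2_client.create_capacity_reservation(
        InstanceType=instance_type,
        InstancePlatform=platform,
        AvailabilityZone=availability_zone,
        InstanceCount=count,
        EndDateType="unlimited",  # stays active (and billed) until cancelled
    )
    return response["CapacityReservation"]["CapacityReservationId"]
```

The reservation is billed whether or not instances run in it, and it can be released with `ec2_client.cancel_capacity_reservation(CapacityReservationId=...)`. The create call itself can fail if the Availability Zone has no capacity at that moment.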

AWS
SUPPORT ENGINEER
answered 7 months ago
