Unable to use a GPU instance in SageMaker


Hey everyone, I'm trying to train a model on SageMaker and want to use a GPU instance for it. I can confirm that I am not on the free tier, but I cannot see the P3 or P5 instances in the list. Also, in the image selection I cannot find any TensorFlow image that I could use. I also tried a G5 instance, but when I try to run the notebook instance after selecting it, I get the error shown in the third image. Any help would be appreciated.

[Screenshots: Instances; Images I see; Error I receive]

Wahaj11
asked 6 months ago · 1162 views
2 Answers

There are a couple of things to check here:

G/P instance quota availability

Certain instances, such as G5 and P4, need a quota increase in the AWS console because they are not enabled by default in your account. The error you are seeing probably looks something like this:

ResourceLimitExceeded: The account-level service limit 'Studio KernelGateway Apps running on ml.g5.xlarge instance' is 0 Apps, with current utilization of 0 Apps and a request delta of 1 Apps. Please use AWS Service Quotas to request an increase for this quota. If AWS Service Quotas is not available, contact AWS support to request an increase for this quota

You can check your EC2 quotas under Service Quotas in the AWS console by searching for instance families such as G (filter: "Running On-Demand G") or P (filter: "Running On-Demand P").
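The same lookup can be scripted instead of clicking through the console. Below is a minimal sketch using the boto3 Service Quotas client; the helper name `find_quotas` and the region are assumptions for illustration, and the case-insensitive substring match only approximates the console's search box.

```python
def find_quotas(client, service_code, name_fragment):
    """Return (name, code, value) for quotas whose name contains name_fragment.

    `client` is expected to be a boto3 "service-quotas" client.
    find_quotas is a hypothetical helper, not part of boto3.
    """
    matches = []
    paginator = client.get_paginator("list_service_quotas")
    for page in paginator.paginate(ServiceCode=service_code):
        for quota in page["Quotas"]:
            # Case-insensitive substring match, similar to the console filter.
            if name_fragment.lower() in quota["QuotaName"].lower():
                matches.append(
                    (quota["QuotaName"], quota["QuotaCode"], quota["Value"])
                )
    return matches


def main():
    # Needs AWS credentials; the region here is an assumption, use your own.
    import boto3

    client = boto3.client("service-quotas", region_name="us-east-1")
    for name, code, value in find_quotas(client, "ec2", "Running On-Demand G"):
        print(f"{name} [{code}]: {value}")
```

Calling `main()` prints each matching EC2 quota with its current value. A value of 0 would explain why an instance family cannot be launched.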

EC2 instance type availability in the region

On the other hand, if the instance type doesn't appear in the list, it probably means that it is not available in the region where you have deployed SageMaker Studio. You can look up the instances available per region under On-Demand Plans for Amazon EC2 and verify that the instance type is offered there.
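If you prefer to check this programmatically, here is a rough sketch using the EC2 `describe_instance_type_offerings` call. The helper name `is_offered_in_region` is an assumption, and so is the idea that EC2 availability is a good proxy for SageMaker availability of the matching `ml.` type; treat the result as a hint only.

```python
def is_offered_in_region(ec2_client, instance_type):
    """Return True if the EC2 client's region offers the given instance type.

    Accepts SageMaker-style names ("ml.g5.xlarge") by dropping the "ml." prefix.
    is_offered_in_region is a hypothetical helper, not part of boto3.
    """
    if instance_type.startswith("ml."):
        instance_type = instance_type[3:]
    response = ec2_client.describe_instance_type_offerings(
        LocationType="region",
        Filters=[{"Name": "instance-type", "Values": [instance_type]}],
    )
    # An empty list means the region has no offering for this type.
    return len(response["InstanceTypeOfferings"]) > 0


def main():
    # Needs AWS credentials; the region here is an assumption, use your own.
    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")
    for name in ("ml.g5.xlarge", "ml.p3.2xlarge"):
        print(name, "offered:", is_offered_in_region(ec2, name))
```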

*If you find this useful and it solves your question, please remember to accept the answer.

AWS
avelizf
answered 6 months ago
  • This response was helpful: increasing the quota enabled me to run a g4dn.xlarge instance from within EC2.

    Unfortunately, SageMaker still gives the same error message: "Unable to complete operation. Please try again."

    I checked the execution role and attached AmazonEC2FullAccess and AmazonSageMakerFullAccess permissions without success.

    Note that the SageMaker instance is accessed via Identity Center.

    I have already spent a lot of time on this and have run out of ideas. Can you think of anything else?


If you're looking for the g4dn.xlarge instance, the quota you want to increase is actually for SageMaker, not for EC2: "Studio JupyterLab Apps running on ml.g4dn.xlarge instances".
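For reference, a sketch of how that request could be made with the boto3 Service Quotas client rather than the console. The helper name `request_increase` is hypothetical, and the quota name fragment is taken from the wording above, so the exact name in your account may differ slightly (hence the substring match). It also assumes the quota shows up under the `sagemaker` service code in `list_service_quotas`.

```python
def request_increase(client, name_fragment, desired_value, service_code="sagemaker"):
    """Request an increase for the first quota whose name contains name_fragment.

    Returns None if the current value already meets desired_value.
    `client` is expected to be a boto3 "service-quotas" client.
    request_increase is a hypothetical helper, not part of boto3.
    """
    paginator = client.get_paginator("list_service_quotas")
    for page in paginator.paginate(ServiceCode=service_code):
        for quota in page["Quotas"]:
            if name_fragment.lower() not in quota["QuotaName"].lower():
                continue
            if quota["Value"] >= desired_value:
                return None  # nothing to request
            return client.request_service_quota_increase(
                ServiceCode=service_code,
                QuotaCode=quota["QuotaCode"],
                DesiredValue=float(desired_value),
            )
    raise LookupError(f"no {service_code} quota matching {name_fragment!r}")


def main():
    # Needs AWS credentials; the region here is an assumption, use your own.
    import boto3

    client = boto3.client("service-quotas", region_name="us-east-1")
    result = request_increase(
        client, "JupyterLab Apps running on ml.g4dn.xlarge", 1
    )
    print(result["RequestedQuota"]["Status"] if result else "quota already sufficient")
```

The increase is not instant: the request goes into a pending state and has to be approved before the instance type can be started in Studio.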

rubi242
answered 4 months ago
