The most likely reason your code is not running is that you are not deploying it with a Hugging Face ECR container. See https://docs.aws.amazon.com/sagemaker/latest/dg/deploy-model.html for details on the deployment options available in SageMaker.
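For reference, deploying a Hub model on a Hugging Face container with the SageMaker Python SDK could look roughly like the sketch below. This is a minimal sketch, not your exact setup: the helper names, the role ARN, the instance type, and the framework versions are all assumptions and need to be replaced with values valid for your account and region.

```python
def build_hub_env(model_id: str, task: str) -> dict:
    """Environment variables the Hugging Face inference container reads
    to pull a model from the Hub at startup."""
    return {"HF_MODEL_ID": model_id, "HF_TASK": task}


def deploy_hub_model(model_id: str, task: str, role_arn: str, instance_type: str):
    """Hypothetical helper: creates an endpoint and returns a predictor.

    Needs the `sagemaker` package and AWS credentials, and it creates
    billable resources when called.
    """
    # Imported lazily so the module can be loaded without the SDK installed.
    from sagemaker.huggingface import HuggingFaceModel

    model = HuggingFaceModel(
        env=build_hub_env(model_id, task),
        role=role_arn,
        # Assumed version combination; check which Hugging Face
        # containers are available in your region.
        transformers_version="4.26",
        pytorch_version="1.13",
        py_version="py39",
    )
    return model.deploy(initial_instance_count=1, instance_type=instance_type)
```

You would call `deploy_hub_model` with your own Hub model ID, a task such as `"text-generation"`, your execution role ARN, and an instance type with enough memory for the model.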
For your particular problem, the easiest route is probably SageMaker JumpStart, which provides pre-built code that you can reuse for your own use case. That code already points to the right Hugging Face ECR container. For example, here is the notebook for the Hugging Face Flan-T5 model: https://github.com/aws/amazon-sagemaker-examples/blob/main/introduction_to_amazon_algorithms/jumpstart-foundation-models/text2text-generation-flan-t5.ipynb
In SageMaker Studio, go to Deployments -> SageMaker JumpStart -> Models, Notebook, Solutions and select the model you need. You can then deploy the JumpStart model, choose a suitable instance type, and you will get an endpoint that you can use for inference.
Can you check the endpoint logs to see if there is more detail? You can find the logs in the AWS console by going to SageMaker -> Endpoints (under the 'Inference' header) -> Name of your endpoint -> View logs.
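If the console view is awkward to search, the same logs can also be pulled from CloudWatch with boto3. The following is a rough sketch under some assumptions: the function names and the endpoint name are hypothetical, and it relies on endpoint logs being written to the `/aws/sagemaker/Endpoints/<endpoint-name>` log group.

```python
def endpoint_log_group(endpoint_name: str) -> str:
    """CloudWatch log group that SageMaker uses for an endpoint's containers."""
    return f"/aws/sagemaker/Endpoints/{endpoint_name}"


def fetch_endpoint_log_messages(endpoint_name: str, pattern: str = "", limit: int = 50) -> list:
    """Hypothetical helper: return up to `limit` log messages matching `pattern`.

    Needs the `boto3` package and AWS credentials with CloudWatch Logs
    read access. An empty pattern matches every event.
    """
    # Imported lazily so the module can be loaded without boto3 installed.
    import boto3

    logs = boto3.client("logs")
    response = logs.filter_log_events(
        logGroupName=endpoint_log_group(endpoint_name),
        filterPattern=pattern,
        limit=limit,
    )
    return [event["message"] for event in response["events"]]
```

For example, `fetch_endpoint_log_messages("my-endpoint", "ERROR")` would return only messages containing that term, which can help surface a stack trace near a generic error line.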
I checked the logs, but they aren't very helpful: "2023-05-17T02:43:56,652 [INFO ] W-ehartford__WizardLM-7B-Un-2-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Prediction error"