Your code is probably failing because you are not deploying it with a Hugging Face container image from Amazon ECR. See https://docs.aws.amazon.com/sagemaker/latest/dg/deploy-model.html for details on the deployment options in SageMaker.
For your particular problem, the easiest route is probably SageMaker JumpStart, which provides pre-built code you can reuse for your own use case. That code already points to the right Hugging Face ECR container. For example, here is the notebook for the Hugging Face Flan-T5 model: https://github.com/aws/amazon-sagemaker-examples/blob/main/introduction_to_amazon_algorithms/jumpstart-foundation-models/text2text-generation-flan-t5.ipynb
In SageMaker Studio, go to Deployments --> SageMaker JumpStart --> Models, notebooks, solutions and select the model you need. You can then deploy the JumpStart model, choose the right instance type, and you will have an endpoint that you can use for inference.
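If you would rather script the same JumpStart deployment than click through Studio, something along these lines may work. It is only a sketch: it assumes a recent SageMaker Python SDK that ships `sagemaker.jumpstart.model.JumpStartModel`, valid AWS credentials, and an execution role the SDK can resolve. The helper name `deploy_jumpstart_model` and its `model_cls` parameter are made up for illustration, and the model ID in the usage comment is an assumption you should check against the JumpStart catalog.

```python
def deploy_jumpstart_model(model_id, instance_type=None, model_cls=None):
    """Deploy a JumpStart model and return the resulting predictor.

    model_cls is a hypothetical hook so the helper can be exercised
    without AWS access; by default the SDK's JumpStartModel is used.
    """
    if model_cls is None:
        # Imported lazily so the module loads even without the SDK installed.
        from sagemaker.jumpstart.model import JumpStartModel as model_cls

    # JumpStart resolves the matching container image and model artifacts
    # from the model ID, so no ECR image URI is passed here.
    model = model_cls(model_id=model_id)

    deploy_kwargs = {}
    if instance_type is not None:
        # Otherwise the SDK falls back to the model's default instance type.
        deploy_kwargs["instance_type"] = instance_type

    # This creates a real endpoint, which is billed until you delete it.
    return model.deploy(**deploy_kwargs)


# Example call (assumed model ID, not taken from this thread):
# predictor = deploy_jumpstart_model("huggingface-text2text-flan-t5-xl")
# ...
# predictor.delete_endpoint()
```

Remember to delete the endpoint when you are done testing.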
Can you check the endpoint logs to see if there is more detail? You can find the logs in the AWS console by going to SageMaker -> Endpoints (under the 'Inference' header) -> Name of your endpoint -> View logs.
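The same logs can also be pulled programmatically, which makes it easier to search for a traceback around the failing request. The sketch below assumes the endpoint writes to the CloudWatch log group `/aws/sagemaker/Endpoints/<endpoint name>` and that you pass in a boto3 CloudWatch Logs client; the helper name `find_endpoint_log_events` and the filter pattern in the usage comment are illustrative assumptions.

```python
def find_endpoint_log_events(logs_client, endpoint_name, pattern=None, limit=50):
    """Return log messages for a SageMaker endpoint from CloudWatch Logs.

    logs_client is expected to behave like boto3.client("logs").
    pattern is an optional CloudWatch Logs filter pattern.
    """
    kwargs = {
        "logGroupName": f"/aws/sagemaker/Endpoints/{endpoint_name}",
        "limit": limit,
    }
    if pattern:
        kwargs["filterPattern"] = pattern

    # Only the first page of results is read; follow "nextToken" for more.
    response = logs_client.filter_log_events(**kwargs)
    return [event["message"] for event in response.get("events", [])]


# Example call (requires boto3 and AWS credentials):
# import boto3
# messages = find_endpoint_log_events(
#     boto3.client("logs"),
#     "name-of-your-endpoint",
#     pattern="?ERROR ?Exception ?Traceback",
# )
# print("\n".join(messages))
```

If the filtered output is empty, try again without a pattern and look at the lines just before the error.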
I checked the logs, but they aren't very helpful: "2023-05-17T02:43:56,652 [INFO ] W-ehartford__WizardLM-7B-Un-2-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Prediction error"