# Describe the issue
Imports:

```python
import sagemaker
from sagemaker.huggingface.model import HuggingFaceModel
```
Code:

```python
sagemaker_session = sagemaker.Session()
role = sagemaker.get_execution_role()

hub = {
    'HF_MODEL_ID': "llava-hf/llava-1.5-7b-hf",  # model_id from hf.co/models
    'HF_TASK': 'image-to-text'                  # task to use for predictions
}

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    env=hub,                      # configuration for loading model from Hub
    role=role,                    # IAM role with permissions to create an endpoint
    transformers_version="4.26",  # Transformers version used
    pytorch_version="1.13",       # PyTorch version used
    py_version="py39",            # Python version used
)

# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.4xlarge"
)
```
Request:

```python
data_url = {'inputs': 'https://storage.googleapis.com/sfr-vision-language-research/BLIP/demo.jpg'}
predictor.predict(data_url)
```
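For completeness: `predictor.predict()` goes through the `sagemaker-runtime` `InvokeEndpoint` API (which is where the error below is reported). The same request can be reproduced with boto3 directly — a sketch, assuming the endpoint name taken from `predictor.endpoint_name`:

```python
# Equivalent raw invocation via boto3 (sketch; the endpoint name is whatever
# huggingface_model.deploy() created, exposed as predictor.endpoint_name).
import json
import boto3

runtime = boto3.client("sagemaker-runtime")
response = runtime.invoke_endpoint(
    EndpointName=predictor.endpoint_name,
    ContentType="application/json",
    Body=json.dumps(data_url),
)
print(response["Body"].read().decode())
```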
Error:

```
ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from model with message "{
  "code": 400,
  "type": "InternalServerException",
  "message": "\u0027llava\u0027"
}
". See https://ap-southeast-1.console.aws.amazon.com/cloudwatch/home?region=ap-southeast-1#logEventViewer:group=/aws/sagemaker/Endpoints/huggingface-pytorch-inference-2024-03-16-14-04-39-322 in account 601804951188 for more information.
```
# Attempted Solutions

Consulted:

- https://github.com/sungeuns/gen-ai-sagemaker/blob/main/MultiModal/02-llava-sagemaker-endpoint.ipynb
- https://github.com/haotian-liu/LLaVA/issues/600
- https://github.com/haotian-liu/LLaVA/issues/907
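One direction I have considered but not been able to confirm: requesting a newer Hugging Face DLC whose transformers release recognizes `llava`, instead of pinning 4.26. Rough, untested sketch — the version strings below are an assumption and would need to be checked against the DLC combinations the SageMaker SDK actually supports:

```python
# Untested sketch: same deployment, but asking for a newer Hugging Face container
# (assumption: a DLC with transformers >= 4.36 exists; the exact
# transformers_version / pytorch_version / py_version combination may need to
# be adjusted to what sagemaker.image_uris can resolve in this region).
huggingface_model = HuggingFaceModel(
    env=hub,
    role=role,
    transformers_version="4.37",
    pytorch_version="2.1",
    py_version="py310",
)
```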