How does inference work in a multi-model endpoint in SageMaker?


Based on the docs here, https://docs.aws.amazon.com/sagemaker/latest/dg/create-multi-model-endpoint.html, I created a multi-model endpoint and invoked it as documented here: https://docs.aws.amazon.com/sagemaker/latest/dg/invoke-multi-model-endpoint.html. I'm getting an invalid model exception with the message "model version is not defined". My setup: I created two model archives, say modelOne.tar.gz and modelTwo.tar.gz, and both models have their own custom script (inference.py) with the directory structure shown below.
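
For reference, the endpoint was created roughly like this (the image URI, role ARN, bucket, and instance type are placeholders for my actual values):

import boto3

sagemaker_client = boto3.client("sagemaker")

# One SageMaker model in MultiModel mode; ModelDataUrl is the S3 prefix
# that holds modelOne.tar.gz and modelTwo.tar.gz, not a single archive.
sagemaker_client.create_model(
    ModelName="my-multi-model",
    ExecutionRoleArn="arn:aws:iam::111122223333:role/my-sagemaker-role",
    PrimaryContainer={
        "Image": "<pytorch-inference-image-uri>",
        "Mode": "MultiModel",
        "ModelDataUrl": "s3://my-bucket/models/",
    },
)

sagemaker_client.create_endpoint_config(
    EndpointConfigName="my-multi-model-endpoint-config",
    ProductionVariants=[{
        "VariantName": "AllTraffic",
        "ModelName": "my-multi-model",
        "InstanceType": "ml.m5.xlarge",
        "InitialInstanceCount": 1,
    }],
)

sagemaker_client.create_endpoint(
    EndpointName="my-multi-model-endpoint",
    EndpointConfigName="my-multi-model-endpoint-config",
)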

When a request is sent to a multi-model endpoint, does SageMaker uncompress the tar.gz file specified in the request? In my case both models have the same directory structure and the same model.pth file name inside their tar files. Could they be getting mixed up, so that SageMaker is not sure which one to invoke?

inference.py

import torch
import os

def model_fn(model_dir, context):
    # Your_Model is a placeholder for the actual torch.nn.Module subclass.
    model = Your_Model()
    # model_dir is the local directory where SageMaker extracted the archive.
    with open(os.path.join(model_dir, 'model.pth'), 'rb') as f:
        model.load_state_dict(torch.load(f))
    return model

directory structure

model.tar.gz/
|- model.pth
|- code/
  |- inference.py
  |- requirements.txt  
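
Each archive was packaged from inside its model directory and uploaded to the common S3 prefix, roughly like this (my-bucket and the models/ prefix are placeholders):

import tarfile
import boto3

# Paths inside the archive are relative, matching the structure above.
with tarfile.open("modelOne.tar.gz", "w:gz") as tar:
    tar.add("model.pth")
    tar.add("code")

boto3.client("s3").upload_file(
    "modelOne.tar.gz", "my-bucket", "models/modelOne.tar.gz")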

invocation

import boto3

runtime_sagemaker_client = boto3.client("sagemaker-runtime")
body = "1.0,2.0,3.0"  # example CSV payload

response = runtime_sagemaker_client.invoke_endpoint(
    EndpointName="my-multi-model-endpoint",
    ContentType="text/csv",
    TargetModel="modelOne.tar.gz",
    Body=body,
)
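
I invoke the second model the same way; only TargetModel changes:

response = runtime_sagemaker_client.invoke_endpoint(
    EndpointName="my-multi-model-endpoint",
    ContentType="text/csv",
    TargetModel="modelTwo.tar.gz",
    Body=body,
)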
Asked a year ago · 130 views
No answers
