SageMaker hourly inference


I want to use multiple (possibly hundreds of) SageMaker RCF models for regular inference at 1-hour intervals.

The standard solution for real-time inference is to create a multi-model endpoint, which can house many models, and then call this endpoint on demand. I don't think this would be suitable for our use case, since the endpoint would sit unused for 59 minutes and then be overloaded with requests for one minute.

The other solution I have seen is batch transform, but my understanding is that this is better suited to infrequent, large batch jobs.

The ideal solution would be to train the models and output the RCF model artifacts (i.e. the .tar.gz files), then pull the artifacts and create the models on demand (within my own Lambda or batch process) without having to use endpoints. However, I can't find a way to run inference with the AWS RCF model without deploying an endpoint first.

If anyone could suggest either a way around using endpoints, or an efficient way to set up and call these endpoints for hourly inference, that would be greatly appreciated.

Many thanks

shaunak
Asked 9 months ago · 175 views
1 answer

Hi,

given your use case, yes, batch transform jobs are the way to go: you accumulate your input, start the model, run the inferences, and stop the model. Since your inferences are infrequent, it's important to stop the engine when you're done with the current set of inferences to remain as frugal and cost-efficient as possible.
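As a minimal sketch of that pattern, the request below assembles the arguments for the SageMaker CreateTransformJob API (the model name, bucket paths, content type, and instance type are all placeholder assumptions for your setup). A transform job provisions its instances only for the duration of the job and releases them when it finishes, so the "stop the model" step happens automatically:

```python
# Sketch: run one hourly batch of inferences as a SageMaker batch transform job.
# All names (model, buckets, instance type) are placeholders -- adjust to your setup.

def build_transform_job_request(model_name: str, run_id: str) -> dict:
    """Assemble the keyword arguments for sagemaker.create_transform_job()."""
    return {
        "TransformJobName": f"{model_name}-hourly-{run_id}",
        "ModelName": model_name,  # a Model already created from your .tar.gz artifact
        "TransformInput": {
            "DataSource": {
                "S3DataSource": {
                    "S3DataType": "S3Prefix",
                    # the hour's accumulated input records
                    "S3Uri": f"s3://my-bucket/input/{run_id}/",
                }
            },
            "ContentType": "text/csv",
        },
        "TransformOutput": {"S3OutputPath": f"s3://my-bucket/output/{run_id}/"},
        "TransformResources": {
            "InstanceType": "ml.m5.large",
            "InstanceCount": 1,
        },
    }

request = build_transform_job_request("my-rcf-model", "2024-01-01-00")
# boto3.client("sagemaker").create_transform_job(**request)  # needs AWS credentials
```

A Lambda triggered on a schedule could loop over your hundreds of model names and submit one such job per model (or per group of models sharing an input prefix).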

Question: to further reduce your costs, can you infer less frequently than every hour? Say, 4 times a day?

To achieve this level of efficiency, you should develop a fully automated MLOps pipeline: see https://github.com/aws-samples/amazon-sagemaker-safe-deployment-pipeline for a full, end-to-end example with code.


Best,

Didier

AWS
EXPERT
Answered 9 months ago
  • Hi Didier, thank you for your answer! Documentation doesn't seem to be readily available for our particular use case: as I understand it, batch transforms don't inherently support multi-model endpoints. What do you think would be a workaround for this? Best
