Thank you for contacting us regarding streaming fragmented data to your SageMaker endpoint. You mentioned that your invocations arrive fragmented because the payloads are not consolidated before they reach the endpoint, so you want to use a global variable in your inference code to consolidate the payloads before invocation. However, in the CloudWatch logs you see the inference code being initialized multiple times, and you would like to know whether the inference script is initialized separately on each vCPU of the endpoint. Specifically:
- Is the Endpoint running code on multiple separate vCPUs? If so, is the global variable shared across the parallel runs?
- If the code is running on multiple vCPUs, is there a way to force a single instance of the code to run, or to somehow share the variables across all the different vCPUs?
You are correct that SageMaker endpoints use multiple vCPUs to serve inference requests in parallel. However, the documentation does not specify whether the inference script runs as a separate instance on each vCPU, so I cannot confirm whether there is a way to force a single instance of the code to run, or to share global variables across vCPUs. That said, the repeated initializations in your CloudWatch logs suggest that the model server is starting multiple worker processes, each loading the inference script into its own memory space; if that is the case, a Python global variable would not be shared between workers.
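As a general illustration (not SageMaker-specific), the sketch below shows why a plain Python global is not shared across worker processes, and one common way to share a value explicitly using `multiprocessing.Value`. The function names (`worker`, `demo`) are illustrative, not part of any SageMaker API, and a real inference container may manage its workers differently:

```python
import multiprocessing as mp

# Plain module-level global, analogous to a global declared in an
# inference script: every worker process gets its own independent copy.
counter = 0

def worker(results, shared, idx):
    global counter
    counter += 1                  # modifies this process's copy only
    with shared.get_lock():
        shared.value += 1         # explicitly shared across processes
    results[idx] = counter

def demo(n_workers=4):
    shared = mp.Value("i", 0)     # one integer living in shared memory
    results = mp.Manager().dict()
    procs = [mp.Process(target=worker, args=(results, shared, i))
             for i in range(n_workers)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return sorted(results.values()), shared.value, counter

if __name__ == "__main__":
    per_process, shared_total, parent_counter = demo()
    print(per_process)     # [1, 1, 1, 1] -- each worker saw only its own copy
    print(shared_total)    # 4 -- the shared Value accumulated every update
    print(parent_counter)  # 0 -- the parent's global was never touched
```

Note that `multiprocessing.Value` only helps when you control how the worker processes are spawned; inside a managed model server, an external store (for example, a cache or database reachable from all workers) is the more typical way to share state, which is something Premium Support can advise on for your specific setup.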
Given the complexity around resolving this issue, I would recommend reaching out to AWS Premium Support. They can help troubleshoot your specific endpoint configuration and data pipelines to advise the best approach for sharing state. They can also assist with implementation if needed.
Please open a Premium Support case in the AWS Console or call phone support, which is available 24/7. Reference this re:Post summary when you connect so an engineer can investigate further with you.