Access to model container logs for SageMaker Async Endpoint

Question

I'm using the NVIDIA Triton deep learning container. When I deploy it to a standard endpoint it works fine: the CloudWatch log group `/aws/sagemaker/Endpoints/[EndpointName]` contains the container logs (i.e. messages written to the console from the inference script).

But with async inference all I get is a single `[production-variant-name]/[instance-id]/data-log` stream containing the information from the async queue, e.g.

2024-04-22T01:59:25.220:[sagemaker logs] [9d5880e2-74fc-431a-b659-c126454b5cc5] Inference request succeeded. ModelLatency: 2267959 us, RequestDownloadLatency: 433665 us, ResponseUploadLatency: 148004 us, TimeInBacklog: 680581 ms, TotalProcessingTime: 683482 ms

This makes it really hard to diagnose issues - how do I access the actual logs from the container when running in async mode?

Answer

Hello,

Thank you for using Amazon SageMaker.

At the moment, the `[production-variant-name]/[instance-id]/data-log` stream contains [all the logs provided by Amazon SageMaker](https://docs.aws.amazon.com/sagemaker/latest/dg/logging-cloudwatch.html) for asynchronous endpoints.
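To confirm which log streams your own endpoint is actually writing, you can list everything under its CloudWatch log group. The following is only a minimal sketch, assuming `boto3` is installed and credentials are configured; the helper names (`log_group_for`, `is_data_log`, `list_streams`) are hypothetical and not part of any AWS SDK.

```python
from typing import List, Optional


def log_group_for(endpoint_name: str) -> str:
    """Build the CloudWatch log group name used for a SageMaker endpoint."""
    return f"/aws/sagemaker/Endpoints/{endpoint_name}"


def is_data_log(stream_name: str) -> bool:
    """True for the async queue stream, e.g. '<variant>/<instance-id>/data-log'."""
    return stream_name.endswith("/data-log")


def list_streams(endpoint_name: str, region_name: Optional[str] = None) -> List[str]:
    """Return the names of all log streams in the endpoint's log group."""
    import boto3  # imported lazily so the pure helpers above work without it

    client = boto3.client("logs", region_name=region_name)
    paginator = client.get_paginator("describe_log_streams")
    names: List[str] = []
    for page in paginator.paginate(logGroupName=log_group_for(endpoint_name)):
        names.extend(stream["logStreamName"] for stream in page["logStreams"])
    return names
```

Calling `list_streams("my-async-endpoint")` (a placeholder endpoint name) and filtering the result with `is_data_log` shows whether anything other than the `data-log` stream is present.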

I have raised a feature request on your behalf to include the model container logs for async endpoints. While I am unable to comment on if or when this feature may be released, please keep an eye on our [What's New](https://aws.amazon.com/new/?whats-new-content-all.sort-by=item.additionalFields.postDateTime&whats-new-content-all.sort-order=desc&awsf.whats-new-categories=*all) and [Blog](https://aws.amazon.com/blogs/aws/) pages for any new feature announcements.
