Hi, SageMaker Serverless Inference is cost-effective for low or intermittent traffic because you pay only for what you use instead of for idle instances: https://docs.aws.amazon.com/sagemaker/latest/dg/serverless-endpoints.html
For Serverless Inference with Provisioned Concurrency, you pay for the compute
capacity used to process inference requests, billed by the millisecond, and the
amount of data processed. You also pay for Provisioned Concurrency usage,
based on the memory configured, duration provisioned, and the amount of
concurrency enabled.
Pricing is detailed here: https://aws.amazon.com/sagemaker/pricing/
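To make the three billing components concrete (compute time, data processed, Provisioned Concurrency), here is a rough estimator sketch in Python. The rate values, the `ServerlessRates` class, and the `estimate_cost` function are all hypothetical names and placeholder numbers I made up for illustration; they are not real AWS prices or an AWS API, so plug in the actual per-region rates from the pricing page for your memory size.

```python
from dataclasses import dataclass


@dataclass
class ServerlessRates:
    """Placeholder rates (assumptions, NOT real AWS prices)."""
    compute_per_second: float      # USD per second of inference compute at your memory size
    data_per_gb: float             # USD per GB of data processed
    provisioned_per_second: float  # USD per second per unit of Provisioned Concurrency


def estimate_cost(rates, requests, avg_duration_ms, data_gb,
                  provisioned_concurrency=0, provisioned_seconds=0):
    """Sum the three billing components described above."""
    # Compute is billed by the millisecond, so convert total duration to seconds.
    compute = requests * (avg_duration_ms / 1000.0) * rates.compute_per_second
    # Data processed in and out of the endpoint.
    data = data_gb * rates.data_per_gb
    # Provisioned Concurrency: amount enabled x duration provisioned
    # (the rate itself depends on the configured memory).
    provisioned = (provisioned_concurrency * provisioned_seconds
                   * rates.provisioned_per_second)
    return {
        "compute": compute,
        "data": data,
        "provisioned": provisioned,
        "total": compute + data + provisioned,
    }


if __name__ == "__main__":
    # Made-up rates, for illustration only.
    rates = ServerlessRates(compute_per_second=0.0001,
                            data_per_gb=0.02,
                            provisioned_per_second=0.00001)
    breakdown = estimate_cost(rates, requests=100_000, avg_duration_ms=200,
                              data_gb=5, provisioned_concurrency=2,
                              provisioned_seconds=3600)
    for name, usd in breakdown.items():
        print(f"{name}: {usd:.4f} USD")
```

Setting `provisioned_concurrency` to 0 gives the plain on-demand serverless estimate, which makes it easy to compare both options for your traffic pattern.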
Hope it helps!
Didier