Hello.
I tried it from the model page on the website, and it seems that only Claude 3.5 Sonnet is unavailable. I'm aware that On-Demand requests are processed from a shared compute pool, but since Provisioned Throughput isn't available for Claude 3.5 Sonnet yet, On-Demand is the only option. Do I just have to weather these service disruptions every single time?
I think you're right.
As of July 2024, Provisioned Throughput cannot be purchased for Claude 3.5 Sonnet, so when a throttling error occurs, I think you have to either accept the error or retry after a short wait.
https://docs.aws.amazon.com/bedrock/latest/userguide/pt-supported.html
Hi,
According to the documentation (https://docs.aws.amazon.com/bedrock/latest/userguide/quotas.html), the current inference quota for Claude 3.5 Sonnet is 50 requests and 400,000 tokens per minute (not adjustable).
So yes, beyond that limit, expect throttling exceptions that you have to handle with retries. I do exactly that in an application I'm currently working on.
Best,
Didier
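To stay under the 50 requests per minute quota quoted above instead of running into it, one option is to pace calls on the client side. The following is only a minimal sketch of a sliding-window limiter: `SlidingWindowLimiter` and its parameters are hypothetical names, not part of any AWS SDK, and it counts requests only, not tokens. Because On-Demand capacity is shared, throttling can presumably still occur below the quota, so this complements retries rather than replacing them.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` calls in any rolling `period` seconds.

    Hypothetical helper for illustration; the defaults mirror the
    50 requests/minute figure quoted in the thread.
    """

    def __init__(self, max_calls=50, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock    # injectable so the limiter can be tested
        self._sleep = sleep
        self._stamps = deque()  # times of recent calls, oldest first

    def acquire(self):
        """Block until another call fits in the window, then record it."""
        while True:
            now = self._clock()
            # Drop timestamps that have aged out of the window.
            while self._stamps and now - self._stamps[0] >= self.period:
                self._stamps.popleft()
            if len(self._stamps) < self.max_calls:
                self._stamps.append(now)
                return
            # Wait until the oldest recorded call leaves the window.
            self._sleep(self.period - (now - self._stamps[0]))
```

Calling `limiter.acquire()` immediately before each model invocation would keep a single process under the request quota. Several processes sharing one account would need a shared limiter, which this sketch does not cover.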
If this is the case, how do I deploy anything I make with Claude 3.5 Sonnet on AWS Bedrock for production?
In a production environment, I think it would be a good idea to wait for a period of time and then retry whenever a throttling error occurs.
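The wait-then-retry approach could look something like the sketch below, which uses exponential backoff with full jitter. This is an illustration under assumptions, not an official pattern: `call_with_backoff` and `ThrottlingError` are hypothetical names, and the delay values are arbitrary. With boto3, throttling typically surfaces as a `botocore.exceptions.ClientError` whose error code is `ThrottlingException`, so in practice you would pass an `is_throttle` predicate that checks for that.

```python
import random
import time


class ThrottlingError(Exception):
    """Hypothetical stand-in for the service's throttling error."""


def call_with_backoff(fn, max_attempts=6, base_delay=1.0, max_delay=30.0,
                      is_throttle=lambda exc: isinstance(exc, ThrottlingError),
                      sleep=time.sleep, rng=random.random):
    """Call fn(), retrying throttling errors with exponential backoff.

    The wait before retry k is a random value in [0, cap), where
    cap = min(max_delay, base_delay * 2**k) ("full jitter").
    Non-throttling errors and the final failure are re-raised.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            # Give up on unrelated errors or when attempts are exhausted.
            if not is_throttle(exc) or attempt == max_attempts - 1:
                raise
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(rng() * cap)
```

A call would then be wrapped as `call_with_backoff(lambda: invoke_model_somehow())`, where `invoke_model_somehow` stands for whatever function sends the request. The jitter spreads retries out so that many clients throttled at the same moment do not all retry together.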