Hello, I understand that you are seeing ThrottlingException when invoking the Claude v2.1 model in on-demand mode, even though your request rate is below the quota limit given in the documentation. Let me address each of your queries below.
- Why did we get these ThrottlingExceptions?
In on-demand mode, a capacity pool is shared across multiple customers. When demand is high and the base model is processing a large number of requests, throttling is possible even though your account has the necessary quota limits in place.
- Can these ThrottlingExceptions occur even when individual quota limits are not reached just because throughput is shared across all customers?
Yes. Because on-demand models use a shared capacity pool, individual accounts may be throttled below their expected rates during periods of high demand across the service. The internal team is working on long-term fixes to expand capacity and address this issue, but we do not currently have an ETA.
- Is switching to provisioned throughput the only option to mitigate this issue?
You can also use retry mechanisms with exponential backoff to mitigate throttling. That said, we suggest considering Provisioned Throughput, since it reserves capacity specifically for your account, so you can avoid the inherent peaks and valleys of on-demand and maintain a consistent level of performance [1].
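As a rough illustration of the retry approach, here is a minimal sketch of exponential backoff with jitter. It uses only the Python standard library. The names `call_with_backoff` and `is_throttling`, and the default delays and attempt count, are hypothetical choices for this example and are not part of any AWS SDK. The check assumes the exception carries a `response` dictionary with an `Error` / `Code` entry, which is how botocore's `ClientError` reports error codes.

```python
import random
import time


def is_throttling(exc):
    """Return True if exc looks like a throttling error.

    Assumption: the exception exposes a `response` dict shaped like
    botocore's ClientError, i.e. {"Error": {"Code": "..."}}.
    """
    response = getattr(exc, "response", None)
    if not isinstance(response, dict):
        return False
    return response.get("Error", {}).get("Code") == "ThrottlingException"


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep, rng=random.random):
    """Call fn(), retrying throttling errors with exponential backoff.

    The wait before retry number n is a random fraction of
    min(max_delay, base_delay * 2**n), so concurrent clients do not
    all retry at the same moment. Other errors are re-raised at once,
    and the last throttling error is re-raised when attempts run out.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            last_attempt = attempt == max_attempts - 1
            if not is_throttling(exc) or last_attempt:
                raise
            cap = min(max_delay, base_delay * (2 ** attempt))
            sleep(cap * rng())
```

For example, assuming a boto3 `bedrock-runtime` client named `client` and an already prepared request `body`, the call could be wrapped as `call_with_backoff(lambda: client.invoke_model(modelId="anthropic.claude-v2:1", body=body))`. The AWS SDKs also have built-in retry settings that may be enough on their own (in boto3, via the `retries` option of `botocore.config.Config`), so please check the SDK documentation before adding a custom wrapper. Note that retries only smooth over short bursts of throttling; they do not add capacity.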
I hope you found this helpful. If you face any other issues or require further assistance, please reach out to AWS Support [2] along with your use case details, and we would be happy to assist you further. Thank you!
References:
[1] https://docs.aws.amazon.com/bedrock/latest/userguide/prov-throughput.html
[2] https://console.aws.amazon.com/support/home#/case/create