Hi. Is there a way of increasing AWS Bedrock quotas without purchasing Provisioned Throughput?

For example, if you want to build an app that summarizes books without chaining prompts, you are limited to Anthropic's models. In production, with even a few clients, requests will be throttled because you easily reach the limit of 200,000 tokens per minute. Is there an alternative for raising that limit without incurring the huge cost of Provisioned Throughput? For example, provisioning 1 model unit for Claude 2.1 costs around $45,990.00 :/. I guess Provisioned Throughput works like EC2 instances: you are billed for the provisioned capacity rather than for actual use, so you would be charged even when it sits idle? It seems there is no intermediate option.


Hi Victor,

Currently there is no way to increase AWS Bedrock quotas without purchasing Provisioned Throughput. This is a common request, and as of now Provisioned Throughput is the only way to raise the limit within a single account.

I would recommend reaching out to your Account Team so they can escalate this request to the service team.

One workaround is a multi-account strategy, since the tokens-per-minute limit applies at the account level. This is certainly not the best developer experience, but it might be a solution for now.
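
To make the multi-account idea a bit more concrete, here is a rough Python sketch of the client-side bookkeeping it would need. It only decides which account a request should go to; the actual Bedrock call is left out. Everything in it is an assumption for illustration: the names `AccountBudget` and `AccountRouter`, the profile names, and the simple fixed 60-second window, which may not match how Bedrock actually meters tokens. The 200,000 figure is the per-account limit from the question.

```python
import time
from dataclasses import dataclass


@dataclass
class AccountBudget:
    """Tokens spent by one AWS account in the current one-minute window."""
    name: str                         # hypothetical profile name, e.g. "bedrock-a"
    limit_per_minute: int = 200_000   # per-account quota quoted in the question
    window_start: float = float("-inf")
    used: int = 0

    def try_reserve(self, tokens: int, now: float) -> bool:
        # Assumption: a fixed 60-second window approximates the real quota.
        if now - self.window_start >= 60.0:
            self.window_start = now
            self.used = 0
        if self.used + tokens > self.limit_per_minute:
            return False
        self.used += tokens
        return True


class AccountRouter:
    """Picks the first account that still has room for a request."""

    def __init__(self, accounts, clock=time.monotonic):
        self.accounts = list(accounts)
        self.clock = clock  # injectable so the logic can be tested

    def pick(self, tokens: int):
        now = self.clock()
        for account in self.accounts:
            if account.try_reserve(tokens, now):
                return account.name
        return None  # every account is at its limit; caller should wait


if __name__ == "__main__":
    router = AccountRouter([AccountBudget("bedrock-a"), AccountBudget("bedrock-b")])
    # estimated prompt + completion tokens for each request
    for estimate in (150_000, 100_000, 50_000):
        print(estimate, "->", router.pick(estimate))
```

The caller would then use the returned name to select credentials for that account (for example a named profile) and send the request there. When `pick` returns `None`, every account is out of budget for the current window, so the request has to be queued or retried later.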

answered 10 months ago
reviewed 10 months ago

