How do I troubleshoot a "429 Throttling" error when I use Amazon Bedrock on-demand resources?

3 minute read
3

I want to troubleshoot the "429 Throttling" error that I receive when I use Amazon Bedrock on-demand resources.

Short description

Amazon Bedrock returns a ThrottlingException (HTTP Status Code: 429) when your requests are denied because you exceeded your AWS account quotas. You receive an error message on the client-side that's similar to the following ones:

  • "Too many requests, please wait before trying again. You have sent too many requests. Wait before trying again."
  • "Your request rate is too high. Reduce the frequency of requests."
  • "Too many tokens, please wait before trying again."

To resolve this issue, complete the following troubleshooting steps for your use case.

Resolution

Note: If you receive errors when you run AWS Command Line Interface (AWS CLI) commands, then see Troubleshooting errors for the AWS CLI. Also, make sure that you're using the most recent AWS CLI version.

Verify and monitor AWS service quotas

Confirm that you're not exceeding your Amazon Bedrock service quotas. For more information, see Viewing service quotas.

To make sure that your application's request volume doesn't exceed the quotas, use Amazon CloudWatch to monitor the InputTokenCount and Invocations Amazon Bedrock runtime metrics. Each metric measures per minute.

Retry the request

It's a best practice to use retries with exponential backoff and random jitter. If you use AWS SDKs, then see Retry behavior.

Use cross-Region inference profiles

Use cross-Region inference profiles to dynamically route traffic across multiple AWS Regions for optimal availability for each request and better performance for high-usage periods. For more information, see the code sample for cross-Region interference on the amazon-bedrock-workshop on the GitHub website.

Note: To use cross-Region features, you must use a Region and model that Amazon Bedrock supports.

Use Provisioned Throughput

If you have high throughput requirements, then purchase Provisioned Throughput. To use Provisioned Throughput with the Amazon Bedrock console, see Use a Provisioned Throughput with an Amazon Bedrock resource. To use Provisioned Throughput with the AWS CLI or Python SDK, see Code examples for Provisioned Throughput.

Note: Before you purchase Provisioned Throughput, make sure that you're using a Region and model that Amazon Bedrock supports.

Request quota increase

If your workload traffic exceeds your account's on-demand quotas, then contact AWS Support or your account manager to request a quota increase. In your request, include the following information:

  • The name of the quota that you want to increase
  • The model's ID
  • The Region for the quota increase
AWS OFFICIAL
AWS OFFICIALUpdated 17 days ago