Bedrock API request limit


Hello,

We are currently working on a project that requires embedding on the order of a billion text sentences. To achieve this, we plan to use the Amazon Titan embedding model via Bedrock, specifically the model amazon.titan-embed-text-v2:0.

However, we have encountered several limitations. The service quota indicates that the model can only be invoked 2000 times per minute, which is insufficient for our needs, and this limit cannot be adjusted. Additionally, we considered using provisioned throughput, but this model is not available in the list for that option.

We also explored batch processing, which allows a maximum batch size of 50,000. While this limit is manageable, we noticed that multiple batch processes seem to execute sequentially rather than in parallel, significantly increasing the overall processing time.

Could you please advise if there is any solution or workaround to address these constraints?

Thank you.

1 Answer

Hi,

The service quotas are expressed on a per-account basis (and yes, you're right: this one is not adjustable).

What I do in such cases is spread the work across multiple accounts to obtain N times the quota. If you then consolidate the results in one head account and do the downstream processing (i.e., everything beyond Bedrock) from that head account, the complexity added by the extra accounts is modest and easy to manage.
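To get a feel for how many accounts this takes, here is a rough back-of-the-envelope sketch. It assumes one sentence per invocation, a total of exactly one billion sentences, and that every account sustains the full 2,000 requests per minute around the clock; all three are assumptions for illustration, and the function names are hypothetical.

```python
import math


def days_to_embed(total_requests: int, per_minute_quota: int, accounts: int = 1) -> float:
    """Wall-clock days needed if every account sustains its full per-minute quota."""
    minutes = total_requests / (per_minute_quota * accounts)
    return minutes / (60 * 24)


def accounts_needed(total_requests: int, per_minute_quota: int, target_days: float) -> int:
    """Smallest number of accounts that finishes within target_days."""
    budget_minutes = target_days * 24 * 60
    return math.ceil(total_requests / (per_minute_quota * budget_minutes))


TOTAL = 1_000_000_000  # assumed sentence count, one request each
QUOTA = 2000           # invocations per minute per account, from the question

for n in (1, 5, 10, 50):
    print(f"{n:>3} account(s): {days_to_embed(TOTAL, QUOTA, n):.1f} days")
# prints 347.2, 69.4, 34.7 and 6.9 days respectively

print(accounts_needed(TOTAL, QUOTA, target_days=30))  # prints 12
```

Under these assumptions a single account needs close to a year, while a dozen accounts bring the job down to about a month.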

I usually run just Lambda functions in the additional accounts, and they write their LLM output (in your case, the embeddings) back to a common S3 bucket in the head account. With a Lambda trigger on this bucket, you get a clean and very scalable architecture.
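A minimal sketch of what such a worker Lambda could look like is below. The bucket name, the event shape (`sentences`, `batch_id`), and the object key layout are hypothetical, and the sketch assumes the head account's bucket policy already allows the worker's role to write to it. It has no retry or throttling logic, which a real worker running close to the quota would need.

```python
import json

MODEL_ID = "amazon.titan-embed-text-v2:0"
HEAD_BUCKET = "head-account-embeddings"  # hypothetical bucket in the head account


def embed_and_store(sentences, batch_id, bedrock, s3, bucket=HEAD_BUCKET):
    """Embed each sentence with Titan and write one JSON Lines object to S3."""
    records = []
    for text in sentences:
        # One invocation per sentence: each call counts against the per-minute quota.
        response = bedrock.invoke_model(
            modelId=MODEL_ID,
            contentType="application/json",
            accept="application/json",
            body=json.dumps({"inputText": text}),
        )
        payload = json.loads(response["body"].read())
        records.append({"text": text, "embedding": payload["embedding"]})

    key = f"embeddings/{batch_id}.jsonl"  # hypothetical key layout
    body = "\n".join(json.dumps(r) for r in records).encode("utf-8")
    s3.put_object(Bucket=bucket, Key=key, Body=body)
    return key


def handler(event, context):
    """Lambda entry point; assumes an event like {"batch_id": ..., "sentences": [...]}."""
    import boto3  # imported here so embed_and_store can be exercised without AWS access

    return embed_and_store(
        event["sentences"],
        event["batch_id"],
        boto3.client("bedrock-runtime"),
        boto3.client("s3"),
    )
```

Passing the clients in as arguments keeps the core function easy to test with stand-ins, and the Lambda in the head account only has to react to new objects under the `embeddings/` prefix.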

Best,

Didier

AWS EXPERT, answered 6 months ago
  • Thanks @Didier for the response, I will give it a go.
