For Amazon Nova Pro in the Frankfurt region (which falls under the "other supported Regions" category), the default quota is 50 requests per minute (RPM). This is the maximum number of on-demand requests you can submit for model inference with Amazon Nova Pro in one minute, and it counts the combined total of Converse, ConverseStream, InvokeModel, and InvokeModelWithResponseStream operations.
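If you want to confirm the values actually applied to your account, you can query the Service Quotas API. Here is a minimal boto3 sketch that lists the Bedrock quotas in eu-central-1 (Frankfurt) and filters for Nova Pro; the "Nova Pro" substring match is just an illustrative way to narrow the list:

```python
import boto3

# Service Quotas client scoped to the Frankfurt region
sq = boto3.client("service-quotas", region_name="eu-central-1")

# Page through all Bedrock quotas and print the Nova Pro ones
paginator = sq.get_paginator("list_service_quotas")
for page in paginator.paginate(ServiceCode="bedrock"):
    for quota in page["Quotas"]:
        if "Nova Pro" in quota["QuotaName"]:
            print(quota["QuotaCode"], quota["QuotaName"],
                  quota["Value"], "adjustable:", quota["Adjustable"])
```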
For custom model deployments using Amazon Nova Pro as the base model, there are additional specific limits: 20 RPM and 80,000 TPM (tokens per minute) per custom model deployment.
Note that the 10,000 TPS (transactions per second) figure you mentioned does not appear to be accurate for Nova Pro based on the available documentation.
If you need to increase these quotas, the process depends on whether the quota is adjustable. The on-demand model inference requests-per-minute quota for Amazon Nova Pro is marked as not adjustable. For quotas that are adjustable, you can request an increase by following the process in the Service Quotas User Guide or by contacting AWS Support, as sketched below.
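For an adjustable quota, the increase request can also be submitted programmatically through the Service Quotas API. A minimal sketch, assuming an adjustable Bedrock quota; the quota code below is a placeholder you would replace with the real code returned by `list_service_quotas`:

```python
import boto3

sq = boto3.client("service-quotas", region_name="eu-central-1")

# Request a higher value for an adjustable Bedrock quota.
# "L-XXXXXXXX" is a hypothetical placeholder quota code.
response = sq.request_service_quota_increase(
    ServiceCode="bedrock",
    QuotaCode="L-XXXXXXXX",
    DesiredValue=100.0,
)
print(response["RequestedQuota"]["Status"])
```

The request is then reviewed by AWS, so the new value is not applied immediately; you can track its status in the Service Quotas console.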
Sources
- On-demand inference on Custom Models - Amazon Nova
- Amazon Bedrock endpoints and quotas - AWS General Reference
- How to increase AWS Bedrock quota for the specific model? | AWS re:Post
- Request an increase for Amazon Bedrock quotas - Amazon Bedrock