Bedrock Provisioned Throughput allows you to reserve a certain amount of processing power for your application to avoid throttling and ensure consistent performance.
What does a model unit exactly mean?
Think of a model unit as a "portion" of computational resources dedicated to your application.
How many tokens per minute is each model unit capable of processing?
Each model unit can handle a certain amount of work, like processing a number of requests or tokens per minute. It's best to refer to the documentation or specifications provided by the service you are using to get the exact details for the model unit you are interested in.
Can I modify the quotas of each model unit?
You can't manually change how much work a model unit can handle. If you need more capacity, you'll need to add more model units.
What do these costs depend on? Will I be billed even if I do not use it?
The costs for model units depend on the resources allocated to them. Even if you don't use the full capacity of a model unit, you'll still be charged based on the allocation.
Once you purchase a model unit for a month, you can't cancel it until the month ends.
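To make the "billed even if unused" point concrete, here is a rough sketch of how a commitment adds up. It assumes the price is quoted as an hourly rate per model unit; the function name and the 20.0 figure are made up for illustration, so take the real rate for your model from the Bedrock pricing page.

```python
HOURS_PER_DAY = 24

def committed_cost(hourly_price_per_mu: float, model_units: int, days: int = 30) -> float:
    """Charge for a commitment term. Usage does not appear in the formula:
    the bill is driven by allocation, not by how many requests you send."""
    if hourly_price_per_mu < 0 or model_units < 1 or days < 1:
        raise ValueError("price must be >= 0, model_units and days must be >= 1")
    return hourly_price_per_mu * model_units * HOURS_PER_DAY * days

# Hypothetical rate of 20.0 per model unit per hour, 2 units, 30-day term.
print(f"{committed_cost(hourly_price_per_mu=20.0, model_units=2, days=30):.2f}")
```

Since nothing in the calculation depends on traffic, an idle model unit costs the same as a fully loaded one for the whole term.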
When would I need more than 1 model unit?
You might need more than one model unit if your application needs to handle a higher workload than what a single unit can manage. This could be because of more requests or more complex processing needs.
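As a rough sizing sketch: once you know the throughput of one model unit for your model, the number of units is your peak load divided by that figure, rounded up. The per-unit throughput is not published in this thread, so the 100,000 tokens per minute used below is purely a placeholder, and the function name is hypothetical.

```python
import math

def model_units_needed(peak_tokens_per_minute: float,
                       tokens_per_minute_per_mu: float) -> int:
    """Smallest whole number of model units that covers the peak load."""
    if tokens_per_minute_per_mu <= 0:
        raise ValueError("tokens_per_minute_per_mu must be positive")
    if peak_tokens_per_minute < 0:
        raise ValueError("peak_tokens_per_minute must not be negative")
    # You cannot buy a fraction of a unit, and you need at least one.
    return max(1, math.ceil(peak_tokens_per_minute / tokens_per_minute_per_mu))

# Placeholder capacity of 100,000 tokens/minute per unit (an assumption,
# not an AWS figure) against a peak of 250,000 tokens/minute.
print(model_units_needed(250_000, 100_000))
```

With those placeholder numbers, two units would fall short of the peak, so the result rounds up to three.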
Thanks! So it is like launching an instance for the model of your choice. I do not see the computational resources or the tokens per minute of any model specified anywhere in the AWS documentation. I just found a section which says 'For more information about what an MU specifies, contact your AWS account manager'. It also says that Provisioned Throughput quotas are adjustable, so it is not very clear. That's why I asked when I would need to purchase 2 MUs if quotas are adjustable.
To better understand the costs, you can use the AWS Pricing Calculator at https://calculator.aws/#/createCalculator/bedrock. It breaks down pricing for the features you intend to use. Note that some models may not be available in the calculator yet. For those, refer to the pricing details at https://aws.amazon.com/bedrock/pricing/ and the runtime quotas listed at https://docs.aws.amazon.com/bedrock/latest/userguide/quotas.html#quotas-runtime.