1 Answer
Hi. It's hard to give a precise answer because latency depends on many factors — for example, the region where the model is deployed, the prompt itself, and so on. I'd suggest using a benchmarking tool such as the one available on GitHub. Here's the repo link:
https://github.com/aws-samples/foundation-model-benchmarking-tool
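If you just want a quick first measurement before setting up the full benchmarking tool, a minimal sketch along these lines can time `InvokeModel` calls yourself. This assumes boto3 is installed, AWS credentials with Bedrock access are configured, and an Anthropic-style model ID — all of which you'd substitute for your own setup; the request body follows the Anthropic Messages format and would need adapting for other model families.

```python
import json
import statistics
import time


def measure_latency(invoke, runs=5):
    """Call `invoke()` `runs` times and return wall-clock latency stats in ms."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        invoke()
        samples.append((time.perf_counter() - start) * 1000.0)
    return {
        "min_ms": min(samples),
        "median_ms": statistics.median(samples),
        "max_ms": max(samples),
    }


def benchmark_bedrock(model_id, region, prompt="Hello", runs=5):
    """Measure Bedrock InvokeModel latency for an Anthropic-style model.

    Requires boto3 and AWS credentials with Bedrock access (assumption:
    you have model access granted in the given region).
    """
    import boto3  # imported here so the timing helper works without it

    client = boto3.client("bedrock-runtime", region_name=region)
    body = json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 64,
        "messages": [{"role": "user", "content": prompt}],
    })

    def invoke():
        client.invoke_model(modelId=model_id, body=body)

    return measure_latency(invoke, runs=runs)
```

For example, `benchmark_bedrock("anthropic.claude-3-haiku-20240307-v1:0", "us-east-1")` would report min/median/max latency over five calls (the model ID here is just an example — use one you have access to). Keep in mind this measures end-to-end client latency, including network time from wherever you run it.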
The quotas can be found here:
https://docs.aws.amazon.com/bedrock/latest/userguide/quotas.html
answered a month ago