ThrottlingException on AWS Bedrock when using meta.llama3-70b-instruct-v1:0


Hi,

When I use meta.llama3-70b-instruct-v1:0, I consistently get ThrottlingExceptions. I am nowhere near the limit (in fact, every request I make gets throttled). I have checked by enabling CloudWatch logs and S3 logs, and no requests are getting through. If I switch to any other model, everything works fine.

Hristo
asked a month ago · 288 views
3 Answers

Hello, a ThrottlingException while invoking models in on-demand mode, even when your requests are below the documented quota limits, can arise because on-demand mode uses a shared capacity pool across multiple customers. Consequently, during periods of high demand, when the base model is processing a substantial number of requests, throttling may occur even if you have the necessary limits in place.

It's important to note that individual accounts can be throttled below their expected rates because the shared capacity pool is used by all customers during high-demand periods. The internal team is actively working on long-term solutions to expand capacity and address this issue, but a specific timeline is not currently available.

To mitigate this issue, consider implementing retries with exponential backoff (see the sketch below). However, switching to provisioned throughput may be the most effective option, as it reserves capacity specifically for your account and avoids the peaks and valleys inherent in on-demand mode.
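For reference, here is a minimal retry sketch in Python using boto3. It is an illustration under assumptions, not an official recipe: the Region, prompt, generation parameters, and attempt/delay limits are placeholders, and you should check the Llama 3 request/response fields against the current Bedrock documentation for your model version.

```python
# Minimal sketch: retry a Bedrock on-demand invocation with exponential
# backoff and jitter. Region, prompt, and generation parameters below are
# assumptions -- adjust them for your own setup.
import json
import random
import time

import boto3
from botocore.exceptions import ClientError

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")  # assumed Region

MODEL_ID = "meta.llama3-70b-instruct-v1:0"


def invoke_with_backoff(prompt: str, max_attempts: int = 6) -> str:
    body = json.dumps({
        "prompt": prompt,
        "max_gen_len": 512,    # example values, tune as needed
        "temperature": 0.5,
    })
    for attempt in range(max_attempts):
        try:
            response = bedrock.invoke_model(modelId=MODEL_ID, body=body)
            # Meta Llama responses on Bedrock return the text in "generation";
            # verify this field name against your model's response format.
            return json.loads(response["body"].read())["generation"]
        except ClientError as err:
            if err.response["Error"]["Code"] != "ThrottlingException":
                raise  # only retry throttling errors
            # exponential backoff with full jitter: up to 1s, 2s, 4s, ... capped at 30s
            delay = random.uniform(0, min(30, 2 ** attempt))
            time.sleep(delay)
    raise RuntimeError(f"Still throttled after {max_attempts} attempts")
```

Alternatively, botocore's built-in retry configuration can handle much of this for you, e.g. passing `botocore.config.Config(retries={"max_attempts": 10, "mode": "adaptive"})` to the client constructor; that is the generic botocore retry mechanism, not anything Bedrock-specific, so it will not help if the shared pool stays saturated.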

Additionally, you could try using a different AWS region to see if that alleviates the throttling issues.

If further assistance is needed, please feel free to reach out to AWS Support.

zeekg
answered a month ago

Have had the same issue with Llama 3. Had to pull it from our production application because of this. No other models have been an issue. Happy to find this question after wasting 3 days with the support center. Thank you OP and Zeekg

answered a month ago

The only issue here is that you cannot buy provisioned throughput for Llama 3 as of now. Hopefully this gets fixed one way or the other.

Hristo
answered a month ago
