Lambda Provisioned Concurrency Metrics


I have an isolated account with a Lambda function that has provisioned concurrency (PC) set to 500 and PC autoscaling on 06. The autoscaling min capacity is set to 500, and the max capacity is set to 1000. During a load test, the Lambda metrics show:

  • Concurrent execution ~189
  • Provisioned concurrent execution ~144
  • Invocation ~427,698
  • PC invocation ~300,504
  • PC spillover invocation ~127,194
  • PC utilization ~28.5%

What I don't understand is why there is spillover when PC utilization is only at about 28%. I assume these spillover invocations will have a cold start, which is what I am trying to avoid by using PC, but the PC was not 100% utilized yet. Does that mean that even with PC we cannot avoid cold starts?
asked a year ago, 500 views
2 Answers
Accepted Answer

What is your rate of invocations? Is it more than 5,000 per second? If so, you are hitting the invocations-per-second limit, which is 10 times the configured provisioned concurrency. In your case, 10 * 500 = 5,000 invocations per second.
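To check this against your own numbers, the arithmetic can be sketched as below. The function names are hypothetical, and the 60-second window in the usage is purely an assumption: the question does not state how long the load test ran, so substitute your own duration.

```python
def pc_rate_limit(provisioned_concurrency: int, per_env_rps: int = 10) -> int:
    """Requests-per-second ceiling: 10 x the configured provisioned concurrency."""
    return provisioned_concurrency * per_env_rps


def average_rate(invocations: int, window_seconds: float) -> float:
    """Average invocations per second over a measurement window."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return invocations / window_seconds


limit = pc_rate_limit(500)  # 10 * 500 = 5000 invocations/sec

# ASSUMPTION: all 427,698 invocations fell inside one 60-second window.
rate = average_rate(427_698, 60)

print(f"limit={limit}/s, average rate={rate:.0f}/s, over limit: {rate > limit}")
```

An average over the whole test can hide short bursts, so a rate below the limit on average does not rule out exceeding it for a few seconds at a time.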

AWS
answered a year ago
reviewed a year ago

The presence of spillover invocations in your scenario indicates that some requests were not served by your provisioned concurrency (PC). The PC utilization of 28.5% is the ratio of provisioned concurrent executions in use to the total allocated provisioned concurrency (here roughly 144 out of 500), as reported for the metric's aggregation period. It does not necessarily reflect the actual demand or the number of concurrent invocations at any given moment.

In your case, the load test shows a peak of about 189 concurrent executions against a provisioned concurrency of 500, so these aggregated numbers alone do not show the provisioned concurrency being exhausted. Spillover invocations are requests that run on standard (on-demand) concurrency because no provisioned environment could take them. That can happen during short bursts that the aggregated metrics smooth over, or when the request rate limit described in the accepted answer is exceeded.
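The figures from the question can be cross-checked with a few lines of arithmetic. This is only a sanity check on the reported numbers; the variable names are made up for the example.

```python
# Values as reported in the question (all approximate).
provisioned = 500          # configured provisioned concurrency
pc_concurrent = 144        # provisioned concurrent executions
invocations = 427_698      # total invocations
pc_invocations = 300_504   # served by provisioned environments
spillover = 127_194        # served by standard concurrency

utilization = pc_concurrent / provisioned        # close to the reported ~28.5%
spillover_share = spillover / invocations        # fraction of requests that spilled over
adds_up = pc_invocations + spillover == invocations

print(f"utilization={utilization:.1%}, spillover share={spillover_share:.1%}, adds up: {adds_up}")
```

The PC and spillover invocation counts sum exactly to the total, and close to 30% of all requests spilled over even though utilization stayed below 30%.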

Cold starts can still happen with provisioned concurrency, but they are less frequent than with on-demand concurrency alone. A cold start means that Lambda has to initialize a new execution environment to handle the incoming request. Provisioned concurrency keeps a set number of environments initialized in advance, but if demand exceeds what they can serve, spillover invocations may experience cold starts.

To address the spillover invocations and potential cold starts, you have a few options:

  • Increase Provisioned Concurrency: If the load test consistently exceeds the provisioned concurrency, consider increasing the provisioned concurrency limit to better accommodate the peak demand and minimize spillover invocations.
  • Adjust Auto Scaling Parameters: Review your auto scaling configuration and ensure that the min and max capacity are set appropriately. If the current settings are not effectively scaling to meet the demand, you may need to fine-tune these parameters to better align with your application's requirements.
  • Monitor and Analyze Load Patterns: Understand the patterns and fluctuations in your application's load. Analyze the metrics over time to identify peak usage periods and adjust your provisioned concurrency and auto scaling settings accordingly.
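As a rough illustration of the auto scaling option, the sketch below builds the request parameters for Application Auto Scaling's `register_scalable_target` and `put_scaling_policy` calls. The helper name, function name, alias, policy name, and the 0.7 target value are all hypothetical placeholders; only the min and max capacities come from the question. The code builds plain dictionaries and does not call AWS.

```python
def pc_autoscaling_requests(function_name, alias, min_capacity, max_capacity,
                            target_utilization):
    """Build kwargs for register_scalable_target and put_scaling_policy."""
    common = {
        "ServiceNamespace": "lambda",
        "ResourceId": f"function:{function_name}:{alias}",
        "ScalableDimension": "lambda:function:ProvisionedConcurrency",
    }
    target = {**common, "MinCapacity": min_capacity, "MaxCapacity": max_capacity}
    policy = {
        **common,
        "PolicyName": "pc-target-tracking",  # hypothetical policy name
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": target_utilization,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "LambdaProvisionedConcurrencyUtilization"
            },
        },
    }
    return target, policy


# "my-function", "live" and 0.7 are illustrative assumptions.
target, policy = pc_autoscaling_requests("my-function", "live", 500, 1000, 0.7)

# With boto3 installed and credentials configured, these could be applied as:
#   client = boto3.client("application-autoscaling")
#   client.register_scalable_target(**target)
#   client.put_scaling_policy(**policy)
print(target["ResourceId"], policy["TargetTrackingScalingPolicyConfiguration"]["TargetValue"])
```

Target tracking reacts to the utilization metric over time, so it may not add capacity fast enough for a sudden load-test ramp; raising the minimum capacity is the more direct lever in that case.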

By optimizing the provisioned concurrency and auto scaling parameters based on your application's load patterns, you can better utilize provisioned concurrency and minimize spillover invocations and potential cold starts.
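To look at load patterns at a finer grain, one option is to query the per-minute maximum of the utilization metric rather than an average over the whole test. The sketch below only builds the parameters for CloudWatch's `get_metric_statistics`; the helper name, function name, and alias are hypothetical, and the dimensions assume the provisioned concurrency is configured on an alias.

```python
from datetime import datetime, timedelta, timezone


def pc_utilization_query(function_name, alias, minutes=60, now=None):
    """Build kwargs for CloudWatch get_metric_statistics (per-minute Maximum)."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/Lambda",
        "MetricName": "ProvisionedConcurrencyUtilization",
        "Dimensions": [
            {"Name": "FunctionName", "Value": function_name},
            {"Name": "Resource", "Value": f"{function_name}:{alias}"},
        ],
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "Period": 60,                 # one datapoint per minute
        "Statistics": ["Maximum"],    # peaks, not averages
    }


# "my-function" and "live" are illustrative assumptions.
query = pc_utilization_query("my-function", "live", minutes=30)

# With boto3 installed and credentials configured:
#   boto3.client("cloudwatch").get_metric_statistics(**query)
print(query["MetricName"], query["Period"], query["Statistics"])
```

The same query shape works for `ProvisionedConcurrencySpilloverInvocations` with the `Sum` statistic, which shows in which minutes the spillover actually occurred.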

answered a year ago
