Kinesis Data Stream via API Gateway: Observed 100k+ Records/Second. Investigating Unexpected Throughput.

I'm observing unexpectedly high throughput for my Kinesis Data Stream integration with API Gateway, and I'm seeking clarification on the metrics and possible explanations.

Setup:

  • Using Kinesis Data Stream with API Gateway REST API integration for PutRecords
  • Kinesis Data Stream: 8 shards, provisioned mode
  • API Gateway: Burst limit and rate limit both set to 5000

Expected behavior: Based on my understanding, each Kinesis shard can process 1000 records per second. With 8 shards, I expected a maximum throughput of 8000 records per second.

Observed behavior:

  • CloudWatch metrics for API Gateway's Count (1-second sum) exceed 100,000
  • Kinesis PutRecords' Records sum (1-second sum) also exceeds 100,000
  • No rate limiting or throttling events recorded for either Kinesis or API Gateway during this period

Questions:

  1. Am I misinterpreting the meaning of these metrics?
  2. How is it possible to achieve such high throughput given the setup described above?
  3. Are there any factors I'm overlooking that could explain this behavior?

I've attached screenshots of the relevant CloudWatch metrics for reference.

Any insights or explanations would be greatly appreciated. Thank you in advance for your help!

[Screenshot: Kinesis PutRecords.Records]

[Screenshot: API GW Count]

[Screenshot: Kinesis Throttling History]

1 Answer

Accepted Answer

API Gateway sends metrics to CloudWatch every minute (see here) and so does Kinesis (see here).

So selecting a period shorter than 1 minute is meaningless: it returns the same value as a 1-minute period.

When you select Statistic=Sum, CloudWatch shows the sum of all samples within the selected period. If that period is 1 minute (or less), it contains only a single data point, so Sum is simply the total count accumulated over that 1 minute, not over 1 second.
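As a rough illustration of querying at the granularity the metrics are actually published at, the sketch below builds the arguments for a CloudWatch `get_metric_statistics` request with a 60-second period. The helper name `one_minute_sum_query` and the stream name `my-stream` are hypothetical, and the boto3 call in the comments assumes boto3 is installed and credentials are configured.

```python
from datetime import datetime, timedelta, timezone


def one_minute_sum_query(stream_name, minutes=15, now=None):
    """Build keyword arguments for a CloudWatch get_metric_statistics call.

    Period is fixed at 60 seconds because the metric is published once per
    minute; a shorter period would not give finer-grained data.
    """
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/Kinesis",
        "MetricName": "PutRecords.Records",
        "Dimensions": [{"Name": "StreamName", "Value": stream_name}],
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "Period": 60,           # seconds per data point
        "Statistics": ["Sum"],  # total records per 1-minute data point
    }


# Possible usage (assumes boto3 and AWS credentials; "my-stream" is a placeholder):
#   import boto3
#   cloudwatch = boto3.client("cloudwatch")
#   response = cloudwatch.get_metric_statistics(**one_minute_sum_query("my-stream"))
#   for point in response["Datapoints"]:
#       print(point["Timestamp"], point["Sum"])
```

Each returned data point would then represent one full minute of traffic.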

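To get the per-second rate, divide the value by 60. So in your example, the average number of Kinesis PutRecords records and API Gateway requests per second is 100,000 / 60 ≈ 1,667. That is well below both the 8,000 records per second you expected from 8 shards and the API Gateway rate limit of 5,000, which is consistent with no throttling being recorded.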
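The arithmetic can be checked with a short sketch. The function name `per_second_rate` is made up for illustration; the limits are the figures quoted in the question (8 shards at 1,000 records per second each, and an API Gateway rate limit of 5,000).

```python
def per_second_rate(period_sum, period_seconds=60):
    """Convert a CloudWatch Sum over one period into an average per-second rate."""
    return period_sum / period_seconds


# Figures taken from the question.
SHARDS = 8
RECORDS_PER_SHARD_PER_SECOND = 1000
stream_limit = SHARDS * RECORDS_PER_SHARD_PER_SECOND  # 8000 records/s
api_rate_limit = 5000                                 # requests/s

rate = per_second_rate(100_000)  # Sum reported for a single 1-minute data point
print(f"{rate:.0f} per second")  # 1667 per second
print(rate < api_rate_limit, rate < stream_limit)  # True True
```

Note that this is an average over the minute; short bursts inside that minute could be higher, and a one-minute data point cannot show them.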

Answered 6 months ago by an AWS expert. Reviewed 6 months ago by an expert.
