Kinesis Data Stream via API Gateway: Observed 100k+ Records/Second. Investigating Unexpected Throughput.


I'm observing unexpectedly high throughput for my Kinesis Data Stream integration with API Gateway, and I'm seeking clarification on the metrics and possible explanations.

Setup:

  • Using Kinesis Data Stream with API Gateway REST API integration for PutRecords
  • Kinesis Data Stream: 8 shards, provisioned mode
  • API Gateway: Burst limit and rate limit both set to 5000

Expected behavior: Based on my understanding, each Kinesis shard can process 1000 records per second. With 8 shards, I expected a maximum throughput of 8000 records per second.

Observed behavior:

  • CloudWatch metrics for API Gateway's Count (1-second sum) exceed 100,000
  • Kinesis PutRecords' Records sum (1-second sum) also exceeds 100,000
  • No rate limiting or throttling events recorded for either Kinesis or API Gateway during this period

Questions:

  1. Am I misinterpreting the meaning of these metrics?
  2. How is it possible to achieve such high throughput given the setup described above?
  3. Are there any factors I'm overlooking that could explain this behavior?

I've attached screenshots of the relevant CloudWatch metrics for reference.

Any insights or explanations would be greatly appreciated. Thank you in advance for your help!

[Screenshot: Kinesis PutRecords.Records]

[Screenshot: API Gateway Count]

[Screenshot: Kinesis throttling history]

1 Answer

Accepted Answer

API Gateway publishes its metrics to CloudWatch at one-minute intervals, and so does Kinesis Data Streams for its stream-level metrics.

So selecting a period shorter than 1 minute adds no resolution: the graph still shows the same value you would get with a 1-minute period.

When you select Statistic=Sum, CloudWatch shows the sum of all data reported within the selected period. If that period is 1 minute (or less), it covers only a single one-minute data point, so the Sum you see is the total count for that whole minute, not for one second.
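As a rough sketch of what querying at the native one-minute granularity could look like, the helper below builds the arguments for CloudWatch's `GetMetricStatistics` call with `Period=60`. The function name `one_minute_sum_query` and the stream name `my-stream` are made up for illustration; the boto3 call itself is shown only in comments, since it needs credentials and a real stream.

```python
from datetime import datetime, timedelta, timezone


def one_minute_sum_query(stream_name, end=None, minutes=15):
    """Build kwargs for CloudWatch GetMetricStatistics at 60-second granularity.

    `stream_name` is a placeholder; pass the name of your own stream.
    """
    end = end or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/Kinesis",
        "MetricName": "PutRecords.Records",
        "Dimensions": [{"Name": "StreamName", "Value": stream_name}],
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "Period": 60,           # matches the one-minute publishing interval
        "Statistics": ["Sum"],  # total records per one-minute data point
    }


# Usage (requires boto3 and AWS credentials, so it is not executed here):
#   import boto3
#   cw = boto3.client("cloudwatch")
#   resp = cw.get_metric_statistics(**one_minute_sum_query("my-stream"))
#   for dp in sorted(resp["Datapoints"], key=lambda d: d["Timestamp"]):
#       print(dp["Timestamp"], dp["Sum"], dp["Sum"] / 60)  # last column: avg records/s
```

Each returned data point then covers exactly one minute, which makes the per-minute nature of the Sum explicit.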

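To get the per-second rate, divide the value by 60. In your example, the average rate for both Kinesis PutRecords.Records and API Gateway Count is 100,000 / 60 ≈ 1,667 per second. That is well below the 8,000 records per second your 8 shards allow and below the API Gateway rate limit of 5,000, which is why you see no throttling.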
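The arithmetic above can be written out as a small sketch. The limits are the ones stated in the question (8 shards at 1,000 records per second each, API Gateway rate limit of 5,000); the variable and function names are illustrative only, and 100,000 stands in for the roughly 100k values seen in the graphs.

```python
def per_second_rate(sum_value, period_seconds=60):
    """Convert a CloudWatch Sum over one period into an average per-second rate."""
    return sum_value / period_seconds


SHARDS = 8
RECORDS_PER_SHARD_PER_SEC = 1000   # per-shard write limit assumed in the question
API_GW_RATE_LIMIT = 5000           # requests per second, from the question

observed_sum = 100_000             # approximate one-minute Sum from the graphs
rate = per_second_rate(observed_sum)
kinesis_limit = SHARDS * RECORDS_PER_SHARD_PER_SEC

print(f"average rate: {rate:.0f} per second")
print(f"within Kinesis limit ({kinesis_limit}/s): {rate <= kinesis_limit}")
print(f"within API Gateway limit ({API_GW_RATE_LIMIT}/s): {rate <= API_GW_RATE_LIMIT}")
```

Keep in mind this is an average over the minute; short bursts inside that minute can be higher than the average suggests.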

Answered 7 months ago by an AWS Expert. Reviewed 7 months ago by an Expert.
