When using the Kinesis Producer Library to aggregate records, send them to a Kinesis Data Stream, and run a Kinesis Data Analytics application, how is de-aggregation handled?


The Kinesis Producer Library aggregates records before sending them to a Kinesis Data Stream.

When consuming records from that Data Stream with the Kinesis Client Library (KCL), record de-aggregation is automatic.

For consuming records from that Data Stream with a Kinesis Data Analytics application, how is de-aggregation handled?

  • Does the Data Analytics application de-aggregate records automatically?
  • Does this case require running the Kinesis Client Library (KCL) inside a Data Analytics application?
  • Something else?
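For context, "aggregation" here means the KPL packs many small user records into one Kinesis record to save on PUT costs, and "de-aggregation" unpacks them on the consumer side. The KPL's real wire format is protobuf-based; the sketch below is an illustration of the idea only, using simple length prefixes instead of the actual format:

```go
// Illustration of KPL-style aggregation/de-aggregation. This is NOT the
// real KPL wire format (which is protobuf-based); it only demonstrates
// the pack/unpack idea using 4-byte big-endian length prefixes.
package main

import "encoding/binary"

// Aggregate packs several user records into one blob, writing a 4-byte
// length before each record.
func Aggregate(records [][]byte) []byte {
	var out []byte
	for _, r := range records {
		var n [4]byte
		binary.BigEndian.PutUint32(n[:], uint32(len(r)))
		out = append(out, n[:]...)
		out = append(out, r...)
	}
	return out
}

// Deaggregate reverses Aggregate, recovering the individual user records.
func Deaggregate(blob []byte) [][]byte {
	var out [][]byte
	for len(blob) >= 4 {
		n := binary.BigEndian.Uint32(blob[:4])
		out = append(out, blob[4:4+n])
		blob = blob[4+n:]
	}
	return out
}
```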
1 Answer

So this is an "it depends" question. The older Flink Kinesis producer (org.apache.flink.streaming.connectors.kinesis.FlinkKinesisProducer) used KPL record aggregation by default, I believe, but there was a configuration setting to turn it off. The newer producer does not aggregate records. I think the version of Kinesis Data Analytics running today uses the older Flink producer, and as far as I know the matching Flink Kinesis consumer de-aggregates KPL records transparently, so a Data Analytics application should not need the KCL for that. https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/datastream/kinesis/
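If it helps, the legacy FlinkKinesisProducer passes its producer Properties through to the underlying KPL, so the switch for aggregation should be the KPL's AggregationEnabled setting. This is a sketch from memory; verify the exact key against your connector version's documentation:

```properties
# KPL settings passed via the Properties given to FlinkKinesisProducer
# (assumption: your connector version forwards KPL config keys unchanged)
AggregationEnabled=false
```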

As for consumers receiving these aggregated records, we use Lambdas written in Go to read the records, de-aggregate them, and write them to our data store. We used this library in our consumer to help pull out the individual records: https://github.com/awslabs/kinesis-aggregation

answered a year ago
