Kinesis Data transformation - using Lambda vs Glue


I am planning to move filtered logs from a CloudWatch log group through Kinesis Firehose to an S3 bucket as Parquet files. Because CloudWatch Logs always delivers gzip-compressed data to Kinesis Firehose, I had to add a Lambda function to decompress it.

Now I am unsure whether the filtered JSON data should be converted to Parquet by the Lambda function (the one already invoked to decompress the data) or by Firehose using a Glue table. Will adding AWS Glue for record format conversion incur additional cost? Is it feasible to convert the data format in the Lambda function itself? What are the pros and cons of each option?

I would appreciate some guidance on this.

2 Answers

Hi Neisha,

To stream the filtered logs through Kinesis Firehose to S3 as Parquet files, the preferred option is to use a Glue table to convert your JSON input data into Parquet, because Kinesis Firehose has a built-in integration with Glue tables for schema specification and record format conversion [1].
If your input data is in a format other than JSON, you can use your Lambda function to convert it into JSON first. In your case, the Lambda function that decompresses the CloudWatch Logs payload can emit the JSON records that Firehose then converts.
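As a rough illustration of that Lambda step, here is a minimal sketch of a Firehose transformation handler that decompresses CloudWatch Logs payloads and emits one JSON object per log event. The output field names are my own choice and would need to match the columns of your Glue table; error handling is left out for brevity.

```python
import base64
import gzip
import json


def lambda_handler(event, context):
    """Gunzip CloudWatch Logs payloads and return JSON that Firehose can convert."""
    output = []
    for record in event["records"]:
        # Firehose hands each record over base64-encoded; CloudWatch Logs gzips it.
        payload = json.loads(gzip.decompress(base64.b64decode(record["data"])))

        # CloudWatch Logs also sends CONTROL_MESSAGE records; they carry no log data.
        if payload.get("messageType") != "DATA_MESSAGE":
            output.append({"recordId": record["recordId"], "result": "Dropped"})
            continue

        # One JSON object per log event (field names here are an assumption).
        lines = "".join(
            json.dumps(
                {
                    "logGroup": payload["logGroup"],
                    "logStream": payload["logStream"],
                    "timestamp": log_event["timestamp"],
                    "message": log_event["message"],
                }
            )
            + "\n"
            for log_event in payload["logEvents"]
        )
        output.append(
            {
                "recordId": record["recordId"],
                "result": "Ok",
                "data": base64.b64encode(lines.encode("utf-8")).decode("utf-8"),
            }
        )
    return {"records": output}
```

A production version would probably also catch decoding errors and return the record with the result "ProcessingFailed" so that Firehose can route it to the error prefix.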

Converting the format in the Lambda function itself requires writing and maintaining your own conversion logic, and it will likely increase the function's execution time. Glue is therefore the go-to choice for this conversion, especially with Kinesis Firehose, which integrates with it directly.
Some examples of this setup are given in [2] and [3].
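To give an idea of what the Glue-based setup looks like, here is a small sketch that builds the record format conversion settings as a Python dictionary. The helper name and the example ARN, database, table, and region values are placeholders of mine, not taken from your setup.

```python
def parquet_conversion_config(role_arn, database, table, region):
    """Build JSON-to-Parquet conversion settings that reference a Glue table."""
    return {
        "Enabled": True,
        # The Glue table supplies the schema Firehose uses to write Parquet.
        "SchemaConfiguration": {
            "RoleARN": role_arn,
            "DatabaseName": database,
            "TableName": table,
            "Region": region,
            "VersionId": "LATEST",
        },
        # Incoming records are parsed as JSON ...
        "InputFormatConfiguration": {"Deserializer": {"OpenXJsonSerDe": {}}},
        # ... and written out as Parquet.
        "OutputFormatConfiguration": {"Serializer": {"ParquetSerDe": {}}},
    }


# Placeholder values for illustration only.
config = parquet_conversion_config(
    role_arn="arn:aws:iam::123456789012:role/example-firehose-role",
    database="example_logs_db",
    table="example_logs_table",
    region="us-east-1",
)
```

A dictionary like this would go under the "DataFormatConversionConfiguration" key of the "ExtendedS3DestinationConfiguration" when you create or update the delivery stream, for example through boto3 or the equivalent CloudFormation properties shown in [2].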

Regarding cost, you pay for Data Catalog storage and requests; see the pricing in [4]. Firehose also bills record format conversion per GB of data converted, so check the Firehose pricing page as well.

References:
[1] https://docs.aws.amazon.com/firehose/latest/dev/record-format-conversion.html
[2] https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-kinesisfirehose-deliverystream.html#aws-resource-kinesisfirehose-deliverystream--examples
[3] https://catalog.us-east-1.prod.workshops.aws/workshops/2300137e-f2ac-4eb9-a4ac-3d25026b235f/en-US/lab-3-kdf/kinesis
[4] https://aws.amazon.com/glue/pricing/

Thanks,
Atul

answered 7 months ago

Go with Glue, as IBAtulAnand mentioned: less overhead, and no conversion code to maintain in the Lambda function.

EXPERT
answered 7 months ago
