Hi Neisha,
To stream the filtered logs through Kinesis Firehose to an S3 bucket as Parquet files, the preferred approach is to define a Glue table and let Firehose convert your JSON input data into Parquet. Kinesis Firehose has a built-in integration with Glue tables for schema specification and record format conversion [1].
If your input data is in a format other than JSON, you can use your Lambda function to convert it into JSON first.
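A transformation Lambda for that JSON step could look roughly like the sketch below. It assumes each incoming record is a single comma-separated line, and the `FIELDS` column names are hypothetical, so adjust both to your actual log format. The event and response shape (`recordId`, `result`, base64-encoded `data`) follows the Firehose data transformation contract.

```python
import base64
import json

# Hypothetical column names for the incoming comma-separated records.
FIELDS = ["timestamp", "level", "message"]


def lambda_handler(event, context):
    """Convert each comma-separated Firehose record into one JSON line."""
    output = []
    for record in event["records"]:
        raw = base64.b64decode(record["data"]).decode("utf-8").strip()
        # Split into at most len(FIELDS) parts so commas in the last field survive.
        parts = raw.split(",", len(FIELDS) - 1)

        if len(parts) != len(FIELDS):
            # Hand the record back unchanged and mark it as failed.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
            continue

        payload = json.dumps(dict(zip(FIELDS, parts))) + "\n"
        output.append({
            "recordId": record["recordId"],
            "result": "Ok",
            "data": base64.b64encode(payload.encode("utf-8")).decode("utf-8"),
        })
    return {"records": output}
```

Firehose then receives plain JSON records, which the Glue-based conversion can turn into Parquet.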
Doing the Parquet conversion inside the Lambda function itself would mean writing and maintaining your own conversion logic, and it would likely increase the function's execution time. Glue is therefore the go-to choice for this conversion, especially with Kinesis Firehose, which integrates with it directly.
Some examples of how to set this up are in [2] and [3] for your reference.
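As a rough illustration of the Glue-based setup, here is a minimal sketch that builds the S3 destination settings with record format conversion enabled. The helper name `parquet_destination_config` and all ARNs and names passed to it are hypothetical placeholders. The buffer size and the SNAPPY compression are assumptions on my side, so check them against [1] before relying on them.

```python
def parquet_destination_config(role_arn, bucket_arn, glue_database, glue_table, region):
    """Build an extended S3 destination config that converts JSON to Parquet."""
    return {
        "RoleARN": role_arn,
        "BucketARN": bucket_arn,
        # Assumption: format conversion needs a larger buffer; 64 MB is used here.
        "BufferingHints": {"SizeInMBs": 64, "IntervalInSeconds": 300},
        "DataFormatConversionConfiguration": {
            "Enabled": True,
            # The Glue table supplies the schema used for the conversion.
            "SchemaConfiguration": {
                "RoleARN": role_arn,
                "DatabaseName": glue_database,
                "TableName": glue_table,
                "Region": region,
                "VersionId": "LATEST",
            },
            # Input records are read as JSON ...
            "InputFormatConfiguration": {"Deserializer": {"OpenXJsonSerDe": {}}},
            # ... and written out as Parquet.
            "OutputFormatConfiguration": {
                "Serializer": {"ParquetSerDe": {"Compression": "SNAPPY"}}
            },
        },
    }


# Placeholder values only; replace with your own resources.
config = parquet_destination_config(
    role_arn="arn:aws:iam::123456789012:role/example-firehose-role",
    bucket_arn="arn:aws:s3:::example-log-bucket",
    glue_database="example_db",
    glue_table="example_logs",
    region="us-east-1",
)
```

With boto3, a dictionary like this would be passed as `ExtendedS3DestinationConfiguration` to the Firehose client's `create_delivery_stream` call. The CloudFormation example in [2] expresses the same settings as a template.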
Regarding cost, you will pay for Data Catalog storage and requests. See the pricing page [4].
References:
[1] https://docs.aws.amazon.com/firehose/latest/dev/record-format-conversion.html
[2] https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-kinesisfirehose-deliverystream.html#aws-resource-kinesisfirehose-deliverystream--examples
[3] https://catalog.us-east-1.prod.workshops.aws/workshops/2300137e-f2ac-4eb9-a4ac-3d25026b235f/en-US/lab-3-kdf/kinesis
[4] https://aws.amazon.com/glue/pricing/
Thanks,
Atul
Go for Glue, as IBAtulAnand mentioned. There is less overhead and no need to maintain the Lambda function.