Is it impossible to deliver records to a Security Lake custom source prefix with Kinesis Firehose?


I'm trying to set up custom sources for Security Lake and have a Kinesis Firehose delivery stream configured to deliver parquet files into the Security Lake bucket under the ext/ prefix.

The problem I'm encountering is that the AWS::KinesisFirehose::DeliveryStream SchemaConfiguration (in CloudFormation: AWS::KinesisFirehose::DeliveryStream Properties->ExtendedS3DestinationConfiguration->DataFormatConversionConfiguration->SchemaConfiguration) requires a Glue table whose schema matches the records before Firehose can deliver the data. Security Lake, however, requires records to already be in place so that Glue can crawl them and create the table and schema that Firehose needs. It looks like it's not possible to stream data directly with Firehose when setting up Security Lake custom sources, and that we need to move the records under the ext/ prefix using Glue or EMR instead.

Is my conclusion correct that this will not work? The fact that Security Lake uses a Glue crawler to create the table makes it less flexible when creating a custom source.

  • To summarize the catch-22:

    • Firehose requires a Glue Table & schema to be set up for delivering parquet records
    • Security Lake requires existing parquet records to crawl in order to create a Glue Table & schema
1 Answer

There may be some alternative approaches.

You could configure Kinesis Data Firehose to deliver data to S3 without converting it to Parquet, and then use Glue or Athena to convert and query the data later. Another possible workaround is to manually create a Glue table that matches the schema of the data you're streaming in. You could then point your Firehose delivery stream at this existing Glue table.
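
As a rough illustration of the manual-table workaround, here is a minimal Python sketch, not a tested Security Lake setup. It builds a Glue TableInput for a Parquet table and a matching Firehose DataFormatConversionConfiguration that references it. The helper names (build_table_input, build_conversion_config, create_table), the bucket path, the database, table and role names, and the two columns are all hypothetical placeholders; the real column list would have to match the schema of your records. Whether Security Lake's own crawler then works cleanly with a table created this way is an assumption you would need to verify.

```python
from typing import Any, Dict, List, Tuple

# Standard Hive class names for Parquet-backed tables in the Glue Data Catalog.
PARQUET_INPUT = "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat"
PARQUET_OUTPUT = "org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat"
PARQUET_SERDE = "org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe"


def build_table_input(
    name: str, location: str, columns: List[Tuple[str, str]]
) -> Dict[str, Any]:
    """Build a Glue TableInput describing a Parquet table at `location`."""
    return {
        "Name": name,
        "TableType": "EXTERNAL_TABLE",
        "Parameters": {"classification": "parquet"},
        "StorageDescriptor": {
            "Columns": [{"Name": n, "Type": t} for n, t in columns],
            "Location": location,
            "InputFormat": PARQUET_INPUT,
            "OutputFormat": PARQUET_OUTPUT,
            "SerdeInfo": {"SerializationLibrary": PARQUET_SERDE},
        },
    }


def build_conversion_config(
    role_arn: str, database: str, table: str, region: str
) -> Dict[str, Any]:
    """Build a Firehose DataFormatConversionConfiguration (JSON in, Parquet out)."""
    return {
        "Enabled": True,
        "InputFormatConfiguration": {"Deserializer": {"OpenXJsonSerDe": {}}},
        "OutputFormatConfiguration": {"Serializer": {"ParquetSerDe": {}}},
        "SchemaConfiguration": {
            "RoleARN": role_arn,
            "DatabaseName": database,
            "TableName": table,
            "Region": region,
            "VersionId": "LATEST",
        },
    }


def create_table(database: str, table_input: Dict[str, Any]) -> None:
    """Create the table in the Glue Data Catalog (requires AWS credentials)."""
    import boto3  # imported lazily so the builders above work without boto3

    boto3.client("glue").create_table(DatabaseName=database, TableInput=table_input)


# Hypothetical example values; replace them with your own names and full schema.
TABLE_INPUT = build_table_input(
    name="my_custom_source",
    location="s3://example-security-lake-bucket/ext/my_custom_source/",
    columns=[("time", "bigint"), ("class_uid", "int")],  # illustrative subset only
)
CONVERSION_CONFIG = build_conversion_config(
    role_arn="arn:aws:iam::123456789012:role/example-firehose-role",
    database="example_database",
    table="my_custom_source",
    region="us-east-1",
)
```

Calling create_table("example_database", TABLE_INPUT) would register the table. CONVERSION_CONFIG mirrors the structure that goes under ExtendedS3DestinationConfiguration->DataFormatConversionConfiguration, so Firehose would no longer depend on a crawler having run first.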

Expert
answered 1 year ago
