Is delivering records to a Security Lake custom source prefix not possible with Kinesis Firehose?

I'm trying to set up custom sources for Security Lake and have a Kinesis Firehose delivery stream configured to deliver parquet files into the Security Lake bucket under the ext/ prefix.

The problem I'm encountering is that the Firehose SchemaConfiguration (in CloudFormation: AWS::KinesisFirehose::DeliveryStream Properties -> ExtendedS3DestinationConfiguration -> DataFormatConversionConfiguration -> SchemaConfiguration) requires a Glue table whose schema matches the records before it can deliver the data. Security Lake, however, requires records to already exist under the prefix so that Glue can crawl them and create the table and schema that Firehose needs. It looks like it's not possible to stream data directly with Firehose when setting up Security Lake custom sources, and that we instead need to move the records under the ext/ prefix using Glue or EMR.

Is my conclusion correct that this will not work? The fact that Security Lake uses a Glue crawler to create the table makes it less flexible when creating a custom source.

To summarize the catch-22:

  • Firehose requires a Glue table and schema to be set up before it can deliver Parquet records
  • Security Lake requires existing Parquet records to crawl in order to create a Glue table and schema

1 Answer
There may be some alternative approaches.

One option is to have Kinesis Data Firehose deliver the data to S3 without converting it to Parquet, and then use Glue or Athena to convert and query the data later. Another possible workaround is to manually create a Glue table that matches the schema of the data you're streaming in, and then point your Firehose delivery stream at that existing table.
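
As a rough illustration of the second workaround, here is a minimal Python sketch that builds a Parquet table definition for Glue and the matching Firehose SchemaConfiguration. The table name, database name, S3 location, role ARN, and column list are all placeholders I made up for the example, not values from Security Lake; the columns in particular must be replaced with the real schema of your records. I have not verified how the Security Lake crawler treats a table that was created by hand, so treat this as a starting point rather than a confirmed fix.

```python
# Hive class names commonly used for Parquet tables in the Glue Data Catalog.
PARQUET_SERDE = "org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe"
PARQUET_INPUT = "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat"
PARQUET_OUTPUT = "org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat"


def build_table_input(table_name, s3_location, columns):
    """Build the TableInput dict passed to Glue's create_table.

    `columns` is a list of (name, glue_type) tuples. The caller supplies
    them; nothing here is derived from Security Lake itself.
    """
    return {
        "Name": table_name,
        "TableType": "EXTERNAL_TABLE",
        "Parameters": {"classification": "parquet"},
        "StorageDescriptor": {
            "Columns": [{"Name": n, "Type": t} for n, t in columns],
            "Location": s3_location,
            "InputFormat": PARQUET_INPUT,
            "OutputFormat": PARQUET_OUTPUT,
            "SerdeInfo": {"SerializationLibrary": PARQUET_SERDE},
        },
    }


def build_schema_configuration(role_arn, database, table_name, region):
    """Build the Firehose SchemaConfiguration that points at the Glue table."""
    return {
        "RoleARN": role_arn,
        "DatabaseName": database,
        "TableName": table_name,
        "Region": region,
        "VersionId": "LATEST",
    }


def create_table(database, table_input):
    """Create the table in Glue. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the builders work without boto3

    glue = boto3.client("glue")
    glue.create_table(DatabaseName=database, TableInput=table_input)


if __name__ == "__main__":
    # Placeholder values: replace with your own names and schema.
    table = build_table_input(
        "my_custom_source",
        "s3://example-security-lake-bucket/ext/my_custom_source/",
        [("time", "bigint"), ("class_uid", "int")],
    )
    schema_cfg = build_schema_configuration(
        "arn:aws:iam::123456789012:role/example-firehose-role",
        "example_database",
        "my_custom_source",
        "us-east-1",
    )
    print(table)
    print(schema_cfg)
    # create_table("example_database", table)  # uncomment to call Glue
```

The dict returned by `build_schema_configuration` has the same keys as the CloudFormation SchemaConfiguration property mentioned in the question, so the same values can be used in a template instead of an API call.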

EXPERT
answered a year ago
