Use Glue schema registry when reading from Kinesis

I want to store the schema for Avro-formatted messages in the Glue Schema Registry and use that schema when reading records from a Kinesis data stream. Currently, to read records from the stream, I'm using something like:

    avro_schemas = {
        "record1": """
        {
            "type": "record",
            "name": "record1",
            "fields": [
                {"name": "intField", "type": "int"},
                {"name": "strField", "type": "string"}
            ]
        }
        """
    }

    dataframe = glueContext.create_data_frame.from_options(
        connection_type="kinesis",
        connection_options={
            "typeOfData": "kinesis",
            "streamARN": <stream_arn>,
            "startingPosition": "latest",
            "classification": "avro",
            "inferSchema": "false",
            # Pass the schema string for the record type being read
            "avroSchema": avro_schemas["record1"]
        },
        transformation_ctx="kinesis_data_frame"
    )

How can I read the schema from the registry and use it to create the data frame?

YK
asked 4 months ago, 642 views
1 Answer
Accepted Answer

You can use boto3 to fetch the schema from the registry with get_schema_version: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/glue/client/get_schema_version.html
Normally you don't need to do that. Instead, you create a Data Catalog table based on the schema and then use that table in the streaming job.
See: https://docs.aws.amazon.com/glue/latest/dg/add-job-streaming.html#create-table-streaming
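
For the boto3 route, a minimal sketch might look like the following. It assumes the schema is stored under a registry and schema name of your choosing (the names "my-registry" and "record1" in the usage comment are hypothetical), that the latest version is the one you want, and that the job role is allowed to call glue:GetSchemaVersion. The helper names fetch_avro_schema and build_kinesis_options are illustrative, not part of any AWS library; the connection options are copied from the question.

```python
import json


def fetch_avro_schema(glue_client, registry_name, schema_name):
    """Return the latest schema definition (a JSON string) from the Glue Schema Registry."""
    response = glue_client.get_schema_version(
        SchemaId={"RegistryName": registry_name, "SchemaName": schema_name},
        SchemaVersionNumber={"LatestVersion": True},
    )
    definition = response["SchemaDefinition"]
    json.loads(definition)  # fail early if the definition is not valid JSON
    return definition


def build_kinesis_options(stream_arn, avro_schema):
    """Build the connection options from the question, using the fetched schema."""
    return {
        "typeOfData": "kinesis",
        "streamARN": stream_arn,
        "startingPosition": "latest",
        "classification": "avro",
        "inferSchema": "false",
        "avroSchema": avro_schema,
    }


# Inside the Glue streaming job (assumed names, not executed here):
#   import boto3
#   glue_client = boto3.client("glue")
#   schema = fetch_avro_schema(glue_client, "my-registry", "record1")
#   dataframe = glueContext.create_data_frame.from_options(
#       connection_type="kinesis",
#       connection_options=build_kinesis_options(stream_arn, schema),
#       transformation_ctx="kinesis_data_frame",
#   )
```

Passing the client in as an argument keeps the lookup easy to test without AWS access. To pin the job to a specific version rather than the latest one, SchemaVersionNumber also accepts {"VersionNumber": n}.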

AWS
EXPERT
answered 4 months ago
