Use Glue schema registry when reading from Kinesis


I want to store the schema for Avro-formatted messages in the Glue Schema Registry and use that schema when reading records from a Kinesis data stream. Currently, to read records from the stream, I'm using something like:

    avro_schemas = {
        "record1": """
        {
          "type": "record",
          "name": "record1",
          "fields": [
            {"name": "intField", "type": "int"},
            {"name": "strField", "type": "string"}
          ]
        }
        """
    }

    dataframe = glueContext.create_data_frame.from_options(
        connection_type="kinesis",
        connection_options={
            "typeOfData": "kinesis",
            "streamARN": <stream_arn>,
            "startingPosition": "latest",
            "classification": "avro",
            "inferSchema": "false",
            # Pass the schema string for this record type.
            "avroSchema": avro_schemas["record1"],
        },
        transformation_ctx="kinesis_data_frame",
    )

How can I read the schema from the registry and use it to create the data frame?

YK
asked 5 months ago · 667 views
1 Answer
Accepted Answer

You can use boto3 to fetch the schema from the registry with get_schema_version: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/glue/client/get_schema_version.html
Normally you don't have to do that, though. Instead, you create a Data Catalog table based on the schema and then use that table in the streaming job.
See: https://docs.aws.amazon.com/glue/latest/dg/add-job-streaming.html#create-table-streaming
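
For the boto3 route, a minimal sketch might look like the following. The helper name fetch_avro_schema, its parameters, and the registry and schema names in the usage note are hypothetical and only for illustration; the sketch assumes the schema is registered as AVRO and that the job role is allowed to call glue:GetSchemaVersion.

```python
import json


def fetch_avro_schema(registry_name, schema_name, glue_client=None):
    """Return the latest schema definition (a JSON string) for a registered schema.

    fetch_avro_schema is a hypothetical helper, not part of boto3 or AWS Glue.
    """
    if glue_client is None:
        # Imported lazily so the helper can also be exercised with a stub client.
        import boto3
        glue_client = boto3.client("glue")

    # Ask the Glue Schema Registry for the latest version of the schema.
    response = glue_client.get_schema_version(
        SchemaId={"RegistryName": registry_name, "SchemaName": schema_name},
        SchemaVersionNumber={"LatestVersion": True},
    )

    # Assumption: the schema was registered with the AVRO data format.
    if response.get("DataFormat") != "AVRO":
        raise ValueError(
            f"Expected an AVRO schema, got {response.get('DataFormat')!r}"
        )

    definition = response["SchemaDefinition"]
    json.loads(definition)  # fail early if the definition is not valid JSON
    return definition
```

The returned string could then be passed as the "avroSchema" connection option in place of the hard-coded schema, for example "avroSchema": fetch_avro_schema("my-registry", "record1"), where both names are placeholders.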

AWS
EXPERT
answered 5 months ago
