Glue job hudi-init-load-job fails when running the script HudiInitLoadNYTaxiData.py

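Hello. We are running a POC and evaluating the capabilities of Glue. As part of that evaluation, I recently activated the latest version of the "Apache Hudi Connector for AWS Glue" (version 0.9.0) (https://aws.amazon.com/marketplace/pp/prodview-zv3vmwbkuat2e?ref_=beagle&applicationId=GlueStudio). Specifically, I am describing what happened when I followed the steps in this article (https://aws.amazon.com/blogs/big-data/writing-to-apache-hudi-tables-using-aws-glue-connector/). We do not currently use AWS Lake Formation, so I carried out every step except the ones related to AWS Lake Formation. Once the CloudFormation stack had completed successfully, I started the hudi-init-load-job job. The result was a bit discouraging: the job failed with the following output: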

2021-12-24 08:50:56,249 ERROR [main] glue.ProcessLauncher (Logging.scala:logError(70)): Error from Python:
Traceback (most recent call last):
  File "/tmp/HudiInitLoadNYTaxiData.py", line 27, in <module>
    glueContext.write_dynamic_frame.from_options(frame = DynamicFrame.fromDF(inputDf, glueContext, "inputDf"), connection_type = "marketplace.spark", connection_options = combinedConf)
  File "/opt/amazon/lib/python3.6/site-packages/awsglue/dynamicframe.py", line 653, in from_options
    format_options, transformation_ctx)
  File "/opt/amazon/lib/python3.6/site-packages/awsglue/context.py", line 279, in write_dynamic_frame_from_options
    format, format_options, transformation_ctx)
  File "/opt/amazon/lib/python3.6/site-packages/awsglue/context.py", line 302, in write_from_options
    return sink.write(frame_or_dfc)
  File "/opt/amazon/lib/python3.6/site-packages/awsglue/data_sink.py", line 35, in write
    return self.writeFrame(dynamic_frame

Asked 6 months ago, 11 views
1 Answer

I just changed the Glue version of the hudi-init-load-job job. It was originally set to Glue 2.0, but after I switched it to 3.0 the job succeeded!
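For anyone who would rather script the version switch than change it in the console, here is a minimal sketch using boto3. It rests on a few assumptions that are not from the thread: the job is updated through `get_job` / `update_job`, the helper names `build_job_update` and `switch_glue_version` are hypothetical, and the set of fields stripped from the job definition is my best guess and may need adjusting for your job. Because `update_job` overwrites the whole job definition, the sketch copies the existing definition and changes only `GlueVersion`.

```python
# Fields returned by get_job that should not be sent back in JobUpdate
# (read-only or deprecated). This set is an assumption; adjust as needed.
SKIPPED_KEYS = {"Name", "CreatedOn", "LastModifiedOn", "AllocatedCapacity"}


def build_job_update(job, glue_version="3.0"):
    """Return a JobUpdate dict copied from `job` with GlueVersion replaced."""
    update = {k: v for k, v in job.items() if k not in SKIPPED_KEYS}
    # Glue rejects MaxCapacity when WorkerType/NumberOfWorkers are also set.
    if "WorkerType" in update or "NumberOfWorkers" in update:
        update.pop("MaxCapacity", None)
    update["GlueVersion"] = glue_version
    return update


def switch_glue_version(job_name="hudi-init-load-job", glue_version="3.0"):
    """Fetch the job definition and write it back with a new Glue version."""
    import boto3  # imported here so build_job_update works without boto3

    glue = boto3.client("glue")
    job = glue.get_job(JobName=job_name)["Job"]
    glue.update_job(
        JobName=job_name,
        JobUpdate=build_job_update(job, glue_version),
    )
```

Calling `switch_glue_version()` with working AWS credentials should move the job from 2.0 to 3.0. It is worth checking the job details in the console afterwards, since any field dropped from the update is reset to its default.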

Answered 6 months ago
