Unable to read Hive ACID tables in Athena using the Athena Hive data connector


Hi, we are trying to use Athena as our consumption service. We have migrated most of the Hive databases/tables from an external Hive metastore to AWS Glue, except for the databases that contain Hive ACID tables, because Glue does not support Hive ACID tables. To read Hive ACID tables from Athena, we configured the Athena connector for Hive based on this article https://docs.aws.amazon.com/athena/latest/ug/connect-to-data-source-hive.html and used the AthenaHiveMetastoreFunctionWithLayer jar.

When I try to query a Hive ACID table (ORC file format) from Athena using the newly created custom catalog for Hive, I get the error below.

"HIVE_CURSOR_ERROR: Failed to read ORC file: s3://my-datalake-bkt-dev/test/acid/ug/base_0000002/bucket_00000"

It looks like Athena is not able to read the Hive ACID file format. Can someone please help me?

RamSet
asked 2 years ago, viewed 804 times
1 Answer
Accepted Answer

Hello,

The latest Athena engine (v2) uses Presto 0.217, which does not support Hive ACID tables. I tried to test this using this article and this, and got the error below:

HIVE_INVALID_BUCKET_FILES: Hive table 'default.acid_tbl' is corrupt. Found sub-directory in bucket directory for partition: 
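Both errors are consistent with how Hive ACID tables are laid out on storage: data files sit inside `base_N` and `delta_x_y` sub-directories rather than directly in the table or partition directory, which a reader without ACID support does not expect. As a rough sketch (the regular expression is my approximation of the directory naming, and the function names are made up for illustration), you could check whether a list of S3 object keys looks like an ACID layout:

```python
import re

# Approximate pattern for Hive ACID directory names (an assumption, not an
# official grammar): base_<writeId>, delta_<min>_<max>, delete_delta_<min>_<max>,
# each with an optional numeric or _v<txnId> suffix.
_ACID_DIR = re.compile(r"^(base_\d+(_v\d+)?|(delete_)?delta_\d+_\d+(_v?\d+)?)$")


def acid_dirs_in_key(s3_key: str) -> list:
    """Return the path segments of an object key that look like ACID directories."""
    segments = s3_key.strip("/").split("/")
    # The last segment is the file name (e.g. bucket_00000), so it is skipped.
    return [seg for seg in segments[:-1] if _ACID_DIR.match(seg)]


def looks_like_acid_table(keys) -> bool:
    """True if any key contains a base_/delta_ style directory."""
    return any(acid_dirs_in_key(key) for key in keys)


if __name__ == "__main__":
    # Key taken from the error message in the question (bucket name omitted).
    print(acid_dirs_in_key("test/acid/ug/base_0000002/bucket_00000"))
```

The keys themselves would come from an S3 listing of the table location; that part is left out here.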

Presto appears to support reading ACID tables only from release 331 onward.

However, as per this doc, Athena does support ACID transactions via AWS Lake Formation governed tables or Iceberg. If you are looking to move your Hive ACID tables to AWS, I would suggest checking out the AWS Lake Formation governed tables feature, which uses the same Glue catalog.

Ref: AWS Lake Formation governed tables blog series

https://aws.amazon.com/blogs/big-data/part-1-effective-data-lakes-using-aws-lake-formation-part-1-getting-started-with-governed-tables/
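If you go the Iceberg route instead, the target table in Athena is declared with the `'table_type' = 'ICEBERG'` table property. Here is a minimal sketch that builds such a DDL statement as a string; the helper name, database, table, columns, and bucket are all placeholders I made up, and since Athena cannot read the ACID files directly, the data would still have to be exported from Hive first (not shown):

```python
def iceberg_ddl(database, table, columns, location):
    """Build an Athena CREATE TABLE statement for an Iceberg table.

    columns is a list of (name, type) pairs. All values passed in the
    example below are hypothetical placeholders.
    """
    cols = ",\n  ".join(f"{name} {col_type}" for name, col_type in columns)
    return (
        f"CREATE TABLE {database}.{table} (\n  {cols}\n)\n"
        f"LOCATION '{location}'\n"
        "TBLPROPERTIES ('table_type' = 'ICEBERG')"
    )


if __name__ == "__main__":
    print(
        iceberg_ddl(
            "default",
            "acid_tbl_iceberg",  # hypothetical target table
            [("id", "bigint"), ("name", "string")],  # hypothetical schema
            "s3://example-bucket/iceberg/acid_tbl/",  # hypothetical location
        )
    )
```

The generated statement can then be run from the Athena console or any Athena client.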

AWS
Support Engineer
answered 2 years ago
Expert
reviewed 1 month ago
