Athena query consistently fails with HIVE_CURSOR_ERROR: Failed to read ORC file


I am seeing Athena queries over a bucket containing ORC files fail with the error message 'HIVE_CURSOR_ERROR: Failed to read ORC file'. Any query over the entirety of the data in the bucket fails. One example query is SELECT * FROM reachcounts_outbound WHERE calculation='a8d9458d-83e2-4e94-b272-3dbcd91296a0', where calculation is a partition column of the reachcounts_outbound table (which is backed by the S3 bucket unscoreit-reachcounts-outbound).

I've validated that the file referenced by the error message is a valid ORC file by downloading it and running orc-tools data on it, and the contents are what I'd expect. I've downloaded other ORC files in the bucket and compared them. They have the same schema and that schema is what I'd expect it to be; it matches the schema I've defined for the table.

I tried deleting the individual file referenced when the error message first appeared, but the query then fails with the same message on a different file in the bucket. However, if I add a LIMIT clause of any number under 1597894 to the query above, it succeeds.

I've tried running MSCK REPAIR TABLE on the reachcounts_outbound table. This did not change anything.

The query id of a request that caused a failure is 54480f27-1992-40f7-8240-17cc622f91db.

Thanks!

Update: The ORC files that are rejected all appear to have exactly 10,000 rows, which is the stride size for the file.
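For anyone who wants to repeat the row-count check from the update, here is a minimal sketch. It assumes the ORC files have been downloaded locally and that pyarrow is installed. The file names in the usage example are hypothetical, and the 10,000-row stride is taken from the update above, so adjust it if your writer uses a different value.

```python
from typing import Dict, List

# Row index stride reported in the question; an assumption for any other dataset.
STRIDE_ROWS = 10_000


def read_row_count(path: str) -> int:
    """Return the number of rows in a local ORC file (requires pyarrow)."""
    # Imported lazily so the pure-Python helper below works without pyarrow.
    from pyarrow import orc

    return orc.ORCFile(path).nrows


def files_matching_stride(row_counts: Dict[str, int], stride: int = STRIDE_ROWS) -> List[str]:
    """Return, in sorted order, the file names whose row count equals the stride."""
    return sorted(name for name, rows in row_counts.items() if rows == stride)


if __name__ == "__main__":
    # Hypothetical file names and counts, for illustration only.
    counts = {"part-0000.orc": 10_000, "part-0001.orc": 8_431}
    print(files_matching_stride(counts))
```

To use it on real files, build the dictionary with read_row_count for each downloaded file and compare the result against the files named in the Athena error messages.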

mattlaq
asked 2 years ago, 2155 views
1 Answer

I had a similar situation. In my case, the issue was that when I ran the CREATE TABLE command to define the Athena table, I did not list every field present in the rows of the ORC files as a column of the table. Once I made sure all the fields were listed, the query worked fine again.
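A quick way to check for this kind of mismatch is to compare the field names stored in an ORC file with the columns declared for the table. The following is a minimal sketch rather than a definitive tool. It assumes pyarrow is installed for reading a locally downloaded file, the column names in the usage example are hypothetical, and the case-insensitive comparison is an assumption on my part. Pass only the non-partition columns of the table, since partition columns such as calculation are not stored inside the files.

```python
from typing import Dict, Iterable, List


def orc_field_names(path: str) -> List[str]:
    """Return the top-level field names of a local ORC file (requires pyarrow)."""
    # Imported lazily so compare_columns can be used without pyarrow.
    from pyarrow import orc

    return list(orc.ORCFile(path).schema.names)


def compare_columns(orc_fields: Iterable[str], table_columns: Iterable[str]) -> Dict[str, List[str]]:
    """Report fields present on one side but not the other.

    Names are lowercased before comparison (an assumption, see the text above).
    """
    orc_list = [name.lower() for name in orc_fields]
    table_list = [name.lower() for name in table_columns]
    orc_set, table_set = set(orc_list), set(table_list)
    return {
        # Fields stored in the file that the table definition does not declare.
        "missing_from_table": [name for name in orc_list if name not in table_set],
        # Columns declared on the table that the file does not contain.
        "missing_from_file": [name for name in table_list if name not in orc_set],
    }


if __name__ == "__main__":
    # Hypothetical column names, for illustration only.
    print(compare_columns(["id", "reach", "segment"], ["id", "reach"]))
```

A non-empty missing_from_table list corresponds to the situation described in this answer: the file carries fields that the CREATE TABLE statement left out.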

muloem
answered 2 years ago
