Athena query consistently fails with HIVE_CURSOR_ERROR: Failed to read ORC file

I am seeing Athena queries over a bucket containing ORC files fail with the error message 'HIVE_CURSOR_ERROR: Failed to read ORC file'. Any query over the entirety of the data in the bucket fails. One specific example is SELECT * FROM reachcounts_outbound WHERE calculation='a8d9458d-83e2-4e94-b272-3dbcd91296a0', where calculation is a partition column of the reachcounts_outbound table (which is backed by the S3 bucket unscoreit-reachcounts-outbound).

I've validated that the file referenced by the error message is a valid ORC file by downloading it and running orc-tools data on it, and the contents are what I'd expect. I've downloaded other ORC files in the bucket and compared them. They have the same schema and that schema is what I'd expect it to be; it matches the schema I've defined for the table.

I've tried deleting the individual file referenced when the error message first appeared, but the query continues to fail with the same message, naming a different file in the bucket. If I add a LIMIT clause with any number under 1597894 to the query above, however, it succeeds.
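For anyone trying to reproduce a threshold like this, a binary search over the LIMIT value finds it in a few dozen query runs. This is only a sketch: query_succeeds is a hypothetical callable that would run the query with the given LIMIT and report whether it completed, and the search assumes that every limit below the threshold succeeds and every limit at or above it fails.

```python
from typing import Callable


def smallest_failing_limit(query_succeeds: Callable[[int], bool], upper: int) -> int:
    """Return the smallest LIMIT in [1, upper] for which the query fails.

    Assumptions (not verified here): success is monotonic in the limit,
    and the query is known to fail at `upper`.
    """
    lo, hi = 1, upper
    while lo < hi:
        mid = (lo + hi) // 2
        if query_succeeds(mid):
            lo = mid + 1  # mid works, so the first failure is above it
        else:
            hi = mid  # mid fails, so the first failure is mid or below
    return lo


# Stand-in for a real Athena call, mirroring the behaviour described above:
# any limit under 1597894 succeeds.
def fake_query_succeeds(limit: int) -> bool:
    return limit < 1597894


first_bad = smallest_failing_limit(fake_query_succeeds, 10_000_000)
```

With the stand-in predicate, first_bad is 1597894, the first limit that no longer succeeds.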

I've tried running MSCK REPAIR TABLE on the reachcounts_outbound table. This did not change anything.

The query id of a request that caused a failure is 54480f27-1992-40f7-8240-17cc622f91db.

Thanks!

Update: The ORC files that are rejected all appear to have exactly 10,000 rows, which is the stride size for the files.

mattlaq
asked 2 years ago, 2163 views
1 Answer

I had a similar situation. In my case, the issue was that the CREATE TABLE statement I used to define the Athena table did not list every field present in the ORC files as a column. Once I made sure all the fields were listed, the query worked fine again.
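A quick way to check for this kind of mismatch is to compare the field names in an ORC file against the columns declared for the table. The sketch below assumes you have already obtained both lists yourself (for example, by reading the file's schema with orc-tools and copying the column names from your CREATE TABLE statement). The field names shown are hypothetical, and the comparison ignores case on the assumption that Athena treats column names as lowercase.

```python
from typing import Iterable, List


def missing_columns(orc_fields: Iterable[str], table_columns: Iterable[str]) -> List[str]:
    """Return the ORC field names that the table definition does not declare.

    Names are compared case-insensitively (an assumption about how Athena
    handles column names). The result keeps the order of the ORC file.
    """
    declared = {name.lower() for name in table_columns}
    return [field for field in orc_fields if field.lower() not in declared]


# Hypothetical example: the file has four fields, the table declares three.
orc_fields = ["id", "segment", "reach", "calculated_at"]
table_columns = ["id", "segment", "reach"]

missing = missing_columns(orc_fields, table_columns)
```

Here missing is ["calculated_at"], the field that would need to be added to the table definition. An empty list means every field in the file is declared.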

muloem
answered 2 years ago
