Athena query failing

HIVE_CANNOT_OPEN_SPLIT: Error opening Hive split s3://vsgdev/amahal797/year_p=2022/month_p=11/day_p=28/file.parquet (offset=603979776, length=33554432): Multiple entries with same key: [rpdname] optional binary =min: , max: d11_walledgarden_v6.cm, num_nulls not defined and [rpdname] optional binary =min: , max: SNTCDSFW03, num_nulls not defined

The database table was populated from a Parquet file in S3.

Any idea what this error means?

asked a year ago · 340 views
1 Answer

This error usually means that the file Athena is trying to read contains duplicate column names. Judging by the error message you shared, file.parquet in your S3 location defines the field 'rpdname' twice: the message lists two [rpdname] entries with different statistics (max: d11_walledgarden_v6.cm and max: SNTCDSFW03), and that is most likely what is causing the failure.

To fix this, check the contents of file.parquet for the duplicate, remove or rename one of the 'rpdname' fields, and then rerun the query. The problem appears to be in the underlying data rather than in the query, so check and clean the data before you run the query again.
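
As a rough illustration of the check described above, the following Python sketch groups column names that collide when compared case-insensitively. The function names are hypothetical, and the case-insensitive comparison is an assumption (Athena treats column names as case-insensitive, so 'rpdname' and 'RpdName' would clash). The second helper assumes pyarrow is installed and that you have downloaded a local copy of the file; it is a sketch, not a verified fix for this specific file.

```python
from collections import defaultdict


def find_duplicate_columns(names):
    """Return groups of column names that collide when lowercased.

    Keys are the lowercased names; values list the original spellings.
    """
    groups = defaultdict(list)
    for name in names:
        groups[name.lower()].append(name)
    # Keep only names that occur more than once.
    return {key: found for key, found in groups.items() if len(found) > 1}


def parquet_column_names(path):
    """Read top-level column names from a Parquet footer (assumes pyarrow)."""
    # Imported lazily so find_duplicate_columns works without pyarrow.
    import pyarrow.parquet as pq

    return pq.read_schema(path).names
```

With a local copy of the file, `find_duplicate_columns(parquet_column_names("file.parquet"))` should return a non-empty dictionary if the schema really contains clashing 'rpdname' columns.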

AWS
SUPPORT ENGINEER
Chaitu
answered a year ago
