Athena query failing


HIVE_CANNOT_OPEN_SPLIT: Error opening Hive split s3://vsgdev/amahal797/year_p=2022/month_p=11/day_p=28/file.parquet (offset=603979776, length=33554432): Multiple entries with same key: [rpdname] optional binary =min: , max: d11_walledgarden_v6.cm, num_nulls not defined and [rpdname] optional binary =min: , max: SNTCDSFW03, num_nulls not defined

The database table was populated from Parquet files stored in S3.

Any idea what this error means?

Asked a year ago, 340 views
1 Answer

This error usually means that the Parquet file Athena was trying to read contains duplicate entries. Judging by the error message you shared, file.parquet in your S3 location appears to define the field 'rpdname' more than once, and that is most likely what causes the failure.

To fix this, inspect file.parquet for the duplicate, remove or rename one of the 'rpdname' fields, and then rerun the query. The problem appears to lie in the underlying data rather than in the query itself, so check and clean the data before you query it again.
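One way to carry out that check is to read only the file's schema and look for repeated column names. The following is a minimal sketch, not a definitive diagnostic: it assumes you have downloaded a local copy of the file, that the third-party pyarrow package is installed, and that names should be compared case-insensitively (Athena treats column names as case-insensitive, so 'rpdname' and 'RpdName' would collide). The helper names find_duplicate_columns and parquet_column_names are made up for illustration.

```python
from collections import Counter


def find_duplicate_columns(names, case_sensitive=False):
    """Return column names that occur more than once, in first-seen order."""
    # Assumption: fold case by default, because Athena lowercases column names.
    keys = [n if case_sensitive else n.lower() for n in names]
    counts = Counter(keys)
    duplicates = []
    for key in keys:
        if counts[key] > 1 and key not in duplicates:
            duplicates.append(key)
    return duplicates


def parquet_column_names(path):
    """Read only the schema (no row data) of a local Parquet file."""
    # pyarrow is a third-party dependency, imported lazily so the
    # duplicate check above can be used without it.
    import pyarrow.parquet as pq

    return pq.read_schema(path).names


# Example usage against a hypothetical local copy of the failing object:
#   names = parquet_column_names("file.parquet")
#   print(find_duplicate_columns(names))
```

If the list that comes back contains 'rpdname', the file itself carries the duplicate, and it would need to be rewritten with one of the columns renamed or dropped before Athena can read it.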

Chaitu, AWS Support Engineer
Answered a year ago
