class org.apache.parquet.io.GroupColumnIO cannot be cast to class org.apache.parquet.io.PrimitiveColumnIO


I'm using Athena on an S3 bucket that's been crawled and I get the error: class org.apache.parquet.io.GroupColumnIO cannot be cast to class org.apache.parquet.io.PrimitiveColumnIO. I've narrowed down the column that's causing it, but it's classified as array<Int> both in the Parquet file and in the Glue database table.

I'm streaming Auth0 logs to an event bus, turning each event into a Parquet file and saving it in S3. I then crawl the S3 bucket and use Athena to analyse the data.

Query Id: b010574d-f708-4855-abde-66dbd90cb7d7

Python script for turning an event into a Parquet file:

import awswrangler
import pandas as pd

s3_path = "s3://{}/{}/{}.parquet".format(s3_bucket_name, event_actual_date, event_id)
# json_normalize already returns a DataFrame, with nested keys joined by "_"
df = pd.json_normalize(event, sep="_")
awswrangler.s3.to_parquet(df, dataset=False, path=s3_path, index=False)
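Because every event is written to its own file, one thing that may be worth checking is whether the problem column really holds a flat list of integers in every event. The sketch below is only a diagnostic under that assumption; it does not confirm the cause of the error. The helper `element_type_counts` and the sample values are hypothetical and not part of the original script.

```python
from collections import Counter
from typing import Any, Iterable


def element_type_counts(values: Iterable[Any]) -> Counter:
    """Count the Python types seen in one column's values.

    Lists are reported by their element types, e.g. "list[int]" or
    "list[dict]", so a column that is not uniformly a list of ints stands out.
    """
    counts: Counter = Counter()
    for value in values:
        if isinstance(value, list):
            # Collect the distinct element type names; mark empty lists explicitly.
            inner = sorted({type(item).__name__ for item in value}) or ["empty"]
            counts["list[" + "|".join(inner) + "]"] += 1
        else:
            counts[type(value).__name__] += 1
    return counts


# Hypothetical values for the problem column, one entry per event.
sample_values = [[1, 2, 3], [{"id": 1}], None, []]
type_counts = element_type_counts(sample_values)
print(type_counts)
```

In the script above, the same helper could be called on the values of the suspect column of `df` for a batch of events. Any key other than `list[int]` would point to events whose files may be written with a different physical type than the table expects.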

Marc
Asked 9 months ago · 81 views
No answers
