Need solution for this Error: Row is not a valid JSON Object - JSONException: Duplicate key "lastupdated"


We imported MIMIC IV data (which is already in .ndjson format) into a HealthLake data store and then exported it. During the import we noticed that the "lastupdated" column was imported three times, and the exported data also shows it three times. When we queried the data in Athena, it failed with the error below. If there is a query that removes duplicate keys from a table row, please share it. If anyone has found a solution to this error, please share that as well. Thanks in advance.

Query Id: de44bccc-36af-488b-8c3d-bcf7e6d9360f

Row is not a valid JSON Object - JSONException: Duplicate key "lastupdated"

Asked a year ago, 387 views
1 Answer

I understand that you imported MIMIC IV data (which is already in .ndjson format) into a HealthLake data store and exported it, that the "lastupdated" column was imported three times, and that your Athena query now fails with the following error:

—————

Row is not a valid JSON Object - JSONException: Duplicate key "lastupdated"

—————

Please note that Athena treats JSON key names as case insensitive, so this error usually occurs when the underlying source data contains multiple keys with the same name that differ only in case (some in uppercase, others in lowercase). Athena does not allow duplicate keys, hence the error you are seeing.
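To confirm that this is what is happening in your export, you could scan a few rows for keys that collide once case is ignored. The following is only a minimal sketch using Python's standard `json` module; the helper name `duplicate_keys` and the sample rows are hypothetical and are not taken from the MIMIC IV export.

```python
import json
from collections import Counter


def duplicate_keys(line: str) -> list:
    """Return the lowercased keys that occur more than once within any
    single JSON object of one NDJSON row, ignoring case."""
    collisions = set()

    def check(pairs):
        # json.loads calls this hook once per JSON object (nested ones
        # included), passing every key/value pair, duplicates included.
        counts = Counter(key.lower() for key, _ in pairs)
        collisions.update(k for k, n in counts.items() if n > 1)
        return dict(pairs)

    json.loads(line, object_pairs_hook=check)
    return sorted(collisions)


# Hypothetical sample rows, for illustration only.
rows = [
    '{"id": "a", "lastUpdated": "2023-01-01", "lastupdated": "2023-01-02"}',
    '{"id": "b", "meta": {"lastUpdated": "2023-01-01"}}',
]
report = [duplicate_keys(row) for row in rows]
print(report)  # [['lastupdated'], []]
```

Keys are compared per object, so the same name at different nesting levels (as in the second row) is not reported. Rows that do show a collision tell you which key spellings need a mapping.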

Therefore, to resolve this issue, you could modify the table settings so that key matching is case sensitive and create a mapping for the problematic columns. Alternatively, you could create a new table for testing, using the same DDL as the original table, and apply these settings there.

Using ALTER TABLE:

ALTER TABLE <yourTableName> SET TBLPROPERTIES (
  'case.insensitive'='false',
  'mapping.col'='Col_Name',
  'mapping.write enabled'='Write Enabled')


Creating a new external table:

CREATE EXTERNAL TABLE <new_tablename> (
  eventType string, -- provide your original table DDL here
  ........
)
ROW FORMAT SERDE '..........'
WITH SERDEPROPERTIES (
  'case.insensitive'='false', -- makes key matching case sensitive
  'mapping.Column_name'='New_Column_name' -- provide the mapping for the problematic column here
)
LOCATION 's3://<YOUR BUCKET HERE>'


To get the DDL for the original table, you can run the following query:

SHOW CREATE TABLE table_name;

A similar issue is discussed in detail in the following AWS Knowledge Center article and Stack Overflow thread:

https://aws.amazon.com/premiumsupport/knowledge-center/json-duplicate-key-error-athena-config/

https://stackoverflow.com/questions/53922517/duplicate-keys-with-amazon-athena-and-open-jsonx-serde

Answered a year ago by an AWS Support Engineer
Reviewed a year ago by an AWS Expert
