Need solution for this Error: Row is not a valid JSON Object - JSONException: Duplicate key "lastupdated"

We imported MIMIC IV data (already in .ndjson format) into a HealthLake data store and then exported it. During the import we noticed that the "lastupdated" column was imported three times, and it also appears three times in the exported data. Querying the export in Athena produces the error below. If there is a query that removes duplicate keys from a table row, please share it. If anyone has found a solution to this error, please share that as well. Thanks in advance.

Query Id: de44bccc-36af-488b-8c3d-bcf7e6d9360f

Row is not a valid JSON Object - JSONException: Duplicate key "lastupdated"

Asked 1 year ago, viewed 387 times
1 Answer
I understand that you imported MIMIC IV data (already in .ndjson format) into a HealthLake data store and exported it, that the "lastupdated" column was imported three times, and that your Athena query now fails with the following error:

—————

Row is not a valid JSON Object - JSONException: Duplicate key "lastupdated"

—————

Please note that Athena treats JSON key names as case-insensitive by default, so this error usually occurs when the underlying source data contains several keys with the same name that differ only in case, for example some in uppercase and others in lowercase. Once the case is ignored, these become duplicate keys, which Athena does not allow, hence the error you are seeing.
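To confirm which records are affected before changing the table, the collision described above can be checked directly on the .ndjson files. The following is a minimal Python sketch, assuming each line holds one JSON object; the helper names `colliding_keys` and `find_collisions` are hypothetical and not part of any AWS tooling.

```python
import json
from collections import defaultdict


def colliding_keys(pairs):
    """Group the keys of one JSON object that are equal once lowercased."""
    groups = defaultdict(list)
    for key, _value in pairs:
        groups[key.lower()].append(key)
    # Keep only the groups where more than one key folds to the same name.
    return {low: keys for low, keys in groups.items() if len(keys) > 1}


def find_collisions(ndjson_line):
    """Return the case-insensitive key collisions found in one NDJSON line.

    Nested objects are checked as well, because the hook below is called
    for every JSON object the parser decodes.
    """
    found = []

    def hook(pairs):
        clash = colliding_keys(pairs)
        if clash:
            found.append(clash)
        return dict(pairs)

    json.loads(ndjson_line, object_pairs_hook=hook)
    return found


if __name__ == "__main__":
    # Illustrative record only; real HealthLake exports will look different.
    sample = '{"id": "1", "meta": {"lastUpdated": "a", "lastupdated": "b"}}'
    print(find_collisions(sample))
```

Running `find_collisions` over each line of an exported file shows whether the duplicates differ only in case (which the case-sensitivity setting can address) or are exact repeats of the same key (which it cannot).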

To resolve this issue, you could modify the table settings so that key names are treated as case-sensitive ('case.insensitive'='false') and create a mapping for the problematic columns. Alternatively, you could create a new table with the same DDL as the original table plus these settings, and test with that.

Using ALTER TABLE:

ALTER TABLE <yourTableName> SET TBLPROPERTIES (
  'case.insensitive'='false',
  'mapping.col'='Col_Name',
  'mapping.write enabled'='Write Enabled')


Creating a new external table:

CREATE EXTERNAL TABLE <new_tablename> (
  eventType string, -- replace with the column list from your original table DDL
  ........
)
ROW FORMAT SERDE '..........'
WITH SERDEPROPERTIES (
  'case.insensitive'='false', -- treat JSON key names as case-sensitive
  'mapping.Column_name'='New_Column_name' -- provide the mapping for the problematic column here
)
LOCATION 's3://<YOUR BUCKET HERE>'


To get the DDL for the original table, you can run the below query:

SHOW CREATE TABLE table_name;

A similar issue is discussed in detail in the following resources:

https://aws.amazon.com/premiumsupport/knowledge-center/json-duplicate-key-error-athena-config/

https://stackoverflow.com/questions/53922517/duplicate-keys-with-amazon-athena-and-open-jsonx-serde

AWS
Support Engineer
Answered 1 year ago
AWS
Expert
Reviewed 1 year ago
