Need solution for this Error: Row is not a valid JSON Object - JSONException: Duplicate key "lastupdated"


We imported MIMIC IV data (which is already in .ndjson format) into a HealthLake data store and then exported it. During the import we noticed that the "lastupdated" column was being imported three times, and the exported data shows it three times as well. Querying the data in Athena then failed with the error below. If there is a query that removes duplicate keys from a table row, please share it. If anyone has found a solution to this error, please share that too. Thanks in advance.

Query Id: de44bccc-36af-488b-8c3d-bcf7e6d9360f

Row is not a valid JSON Object - JSONException: Duplicate key "lastupdated"

Asked 1 year ago, 387 views
1 answer

I understand that you imported MIMIC IV data (which is already in .ndjson format) into a HealthLake data store and exported it, that the import brought in the "lastupdated" column three times, and that your Athena query now fails with the following error:

—————

Row is not a valid JSON Object - JSONException: Duplicate key "lastupdated"

—————

Please note that Athena treats JSON key names as case insensitive by default, so this error usually occurs when the underlying source data contains multiple keys with the same name that differ only in case (for example, some in uppercase and others in lowercase). Athena does not allow duplicate keys in a row, hence the error you are seeing.
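To confirm which keys actually collide before changing the table, the exported .ndjson lines can be checked outside Athena. The following is a minimal Python sketch, not part of the original answer: the function name find_colliding_keys and the sample line are hypothetical, and it assumes each line is a single JSON object. It reports every group of keys within one object that become identical when compared case-insensitively, including exact duplicates.

```python
import json
from collections import defaultdict


def find_colliding_keys(ndjson_line):
    """Return a list of key groups that collide case-insensitively.

    Each group lists the original spellings found inside one JSON object
    (top-level or nested). An empty list means no collisions were found.
    """
    collisions = []

    def check_pairs(pairs):
        # object_pairs_hook receives every (key, value) pair of an object,
        # including exact duplicates that a plain dict would collapse.
        groups = defaultdict(list)
        for key, _ in pairs:
            groups[key.lower()].append(key)
        for originals in groups.values():
            if len(originals) > 1:
                collisions.append(originals)
        return dict(pairs)

    json.loads(ndjson_line, object_pairs_hook=check_pairs)
    return collisions


# Hypothetical sample row with the same key in three different casings.
sample = '{"id": "1", "lastupdated": "a", "lastUpdated": "b", "LASTUPDATED": "c"}'
print(find_colliding_keys(sample))
# prints [['lastupdated', 'lastUpdated', 'LASTUPDATED']]
```

Running this over a few exported lines shows whether the duplicates differ only in case (which the mapping approach addresses) or are exact repeats of the same key.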

To resolve this issue, you can make the table case sensitive by setting 'case.insensitive' to 'false' and create a mapping for each problematic column. You can either alter the existing table or, to test the settings first, create a new table from the original table's DDL with these properties added.

Using ALTER TABLE:

ALTER TABLE <yourTableName> SET TBLPROPERTIES (
  'case.insensitive'='false',
  'mapping.col'='Col_Name',
  'mapping.write enabled'='Write Enabled')


Creating a new external table:

CREATE EXTERNAL TABLE <new_tablename> (
  eventType string, -- replace with the column definitions from your original table DDL
  ........
)
ROW FORMAT SERDE '..........'
WITH SERDEPROPERTIES (
  'case.insensitive'='false', -- makes JSON key matching case sensitive
  'mapping.Column_name'='New_Column_name' -- provide the mapping for the problematic column here
)
LOCATION 's3://<YOUR BUCKET HERE>'


To get the DDL for the original table, you can run the following query:

SHOW CREATE TABLE table_name;

A similar issue is discussed in detail in the following AWS Knowledge Center article and Stack Overflow question:

https://aws.amazon.com/premiumsupport/knowledge-center/json-duplicate-key-error-athena-config/

https://stackoverflow.com/questions/53922517/duplicate-keys-with-amazon-athena-and-open-jsonx-serde

Answered 1 year ago by an AWS Support Engineer
Reviewed 1 year ago by an AWS Expert
