How to define timestamp without mls in Athena table

0

Hi,

I'm trying to query some data in parquet files in Athena, but I have two columns that are timestamp but they don't have mls. The date format is in varchar like: '2023-08-09 15:41:42'. I'm trying to create an Athena table but I don't want to set up this column as a varchar, I want to read it directly as a timestamp. Any idea how can i change the default timestamp format? I've tried adding this in the CREATE EXTERNAL TABLE:

WITH SERDEPROPERTIES (
  'timestampFormat' = 'YYYY-MM-DD HH:MM:SS' 
)

But I still get this error:

TYPE_MISMATCH: Unable to read parquet data. This is most likely caused by a mismatch between the parquet and metastore schema
This query ran against the "trial_eth_db" database, unless qualified by the query. Please post the error message on our forum  or contact customer support  with Query Id: 94c1dba3-2155-476c-8672-e8055863c1da

Thanks a lot in advance.

Sara
已提问 8 个月前768 查看次数
1 回答
1

Hello Sara,

In Athena, when you're dealing with timestamp data in Parquet files that don't include milliseconds (ms), you can still read and convert these columns into timestamp data types. However, you need to handle the timestamp format during the query, as Parquet itself does not store timestamp formats.

You can use the CAST and FROM_ISO8601_TIMESTAMP functions to convert the varchar timestamp data into a timestamp data type during your query. Here's how you can do it:

Assuming you have a table with the column timestamp_column containing varchar timestamp data ('2023-08-09 15:41:42'), you can run a query like this to convert it to a timestamp:

SELECT
  CAST(FROM_ISO8601_TIMESTAMP(timestamp_column) AS timestamp) AS timestamp_column
FROM
  your_table_name;

This query does the following:

  1. FROM_ISO8601_TIMESTAMP(timestamp_column) converts the varchar timestamp to a timestamp with milliseconds.
  2. CAST(... AS timestamp) casts the result back to a timestamp without milliseconds.

Replace your_table_name with the actual name of your Athena table.

This way, you can read and work with your timestamp data as timestamps without milliseconds directly in your queries without the need to modify the table's schema or format settings.

Please give a thumbs up if this suggestion helps

profile picture
已回答 8 个月前
  • Parquet do hold timestamp or datetime values, they just need to be written with the proper schema.

您未登录。 登录 发布回答。

一个好的回答可以清楚地解答问题和提供建设性反馈,并能促进提问者的职业发展。

回答问题的准则

相关内容