How to define timestamp without mls in Athena table

0

Hi,

I'm trying to query some data in parquet files in Athena, but I have two columns that are timestamp but they don't have mls. The date format is in varchar like: '2023-08-09 15:41:42'. I'm trying to create an Athena table but I don't want to set up this column as a varchar, I want to read it directly as a timestamp. Any idea how can i change the default timestamp format? I've tried adding this in the CREATE EXTERNAL TABLE:

WITH SERDEPROPERTIES (
  'timestampFormat' = 'YYYY-MM-DD HH:MM:SS' 
)

But I still get this error:

TYPE_MISMATCH: Unable to read parquet data. This is most likely caused by a mismatch between the parquet and metastore schema
This query ran against the "trial_eth_db" database, unless qualified by the query. Please post the error message on our forum  or contact customer support  with Query Id: 94c1dba3-2155-476c-8672-e8055863c1da

Thanks a lot in advance.

Sara
質問済み 8ヶ月前768ビュー
1回答
1

Hello Sara,

In Athena, when you're dealing with timestamp data in Parquet files that don't include milliseconds (ms), you can still read and convert these columns into timestamp data types. However, you need to handle the timestamp format during the query, as Parquet itself does not store timestamp formats.

You can use the CAST and FROM_ISO8601_TIMESTAMP functions to convert the varchar timestamp data into a timestamp data type during your query. Here's how you can do it:

Assuming you have a table with the column timestamp_column containing varchar timestamp data ('2023-08-09 15:41:42'), you can run a query like this to convert it to a timestamp:

SELECT
  CAST(FROM_ISO8601_TIMESTAMP(timestamp_column) AS timestamp) AS timestamp_column
FROM
  your_table_name;

This query does the following:

  1. FROM_ISO8601_TIMESTAMP(timestamp_column) converts the varchar timestamp to a timestamp with milliseconds.
  2. CAST(... AS timestamp) casts the result back to a timestamp without milliseconds.

Replace your_table_name with the actual name of your Athena table.

This way, you can read and work with your timestamp data as timestamps without milliseconds directly in your queries without the need to modify the table's schema or format settings.

Please give a thumbs up if this suggestion helps

profile picture
回答済み 8ヶ月前
  • Parquet do hold timestamp or datetime values, they just need to be written with the proper schema.

ログインしていません。 ログイン 回答を投稿する。

優れた回答とは、質問に明確に答え、建設的なフィードバックを提供し、質問者の専門分野におけるスキルの向上を促すものです。

質問に答えるためのガイドライン

関連するコンテンツ