How to define timestamp without mls in Athena table

0

Hi,

I'm trying to query some data in parquet files in Athena, but I have two columns that are timestamp but they don't have mls. The date format is in varchar like: '2023-08-09 15:41:42'. I'm trying to create an Athena table but I don't want to set up this column as a varchar, I want to read it directly as a timestamp. Any idea how can i change the default timestamp format? I've tried adding this in the CREATE EXTERNAL TABLE:

WITH SERDEPROPERTIES (
  'timestampFormat' = 'YYYY-MM-DD HH:MM:SS' 
)

But I still get this error:

TYPE_MISMATCH: Unable to read parquet data. This is most likely caused by a mismatch between the parquet and metastore schema
This query ran against the "trial_eth_db" database, unless qualified by the query. Please post the error message on our forum  or contact customer support  with Query Id: 94c1dba3-2155-476c-8672-e8055863c1da

Thanks a lot in advance.

Sara
asked 8 months ago737 views
1 Answer
1

Hello Sara,

In Athena, when you're dealing with timestamp data in Parquet files that don't include milliseconds (ms), you can still read and convert these columns into timestamp data types. However, you need to handle the timestamp format during the query, as Parquet itself does not store timestamp formats.

You can use the CAST and FROM_ISO8601_TIMESTAMP functions to convert the varchar timestamp data into a timestamp data type during your query. Here's how you can do it:

Assuming you have a table with the column timestamp_column containing varchar timestamp data ('2023-08-09 15:41:42'), you can run a query like this to convert it to a timestamp:

SELECT
  CAST(FROM_ISO8601_TIMESTAMP(timestamp_column) AS timestamp) AS timestamp_column
FROM
  your_table_name;

This query does the following:

  1. FROM_ISO8601_TIMESTAMP(timestamp_column) converts the varchar timestamp to a timestamp with milliseconds.
  2. CAST(... AS timestamp) casts the result back to a timestamp without milliseconds.

Replace your_table_name with the actual name of your Athena table.

This way, you can read and work with your timestamp data as timestamps without milliseconds directly in your queries without the need to modify the table's schema or format settings.

Please give a thumbs up if this suggestion helps

profile picture
answered 8 months ago
  • Parquet do hold timestamp or datetime values, they just need to be written with the proper schema.

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions