Athena/Glue Timestamp Format in Microseconds

0

Hello,

I would like to change the timestamp representation in Athena and Glue to microseconds. We crawl the data via Glue, store it in Glue tables, and query it via Athena. By default the timestamp is expressed in milliseconds, so the fractional seconds in our Parquet files are truncated. However, we need the full precision. Here is an example:

Athena/Glue: YYYY-MM-DD hh:mm:ss.fff
Real data:   YYYY-MM-DD hh:mm:ss.ffffff

Is there a way to get the full fractional seconds? Retrieving the lost digits via a function in Athena just pads the fractional seconds with zeros. :(

Thanks in advance.

Best regards, Susa

Susa
asked 1 year ago · 2222 views
2 Answers
2
Accepted Answer

As far as I know, neither AWS Glue nor Amazon Athena natively supports timestamp precision beyond milliseconds; that is, microseconds and nanoseconds are not supported.

Timestamps in AWS Glue and Athena are represented in the YYYY-MM-DD HH:MI:SS.fff format (millisecond precision), as you've noted. This is a limitation of the current timestamp data type in these services. You can read more about the supported data types in the official documentation: Data types in Amazon Athena

There are a couple of possible workarounds, but they are not perfect and may require additional data manipulation:

  1. String Field: Store your timestamps as a string field with your desired precision in your Parquet files. You can then convert or manipulate these strings in your Athena queries as needed.

    You could use the string format YYYY-MM-DD hh:mm:ss.ffffff and parse it into a timestamp when necessary in your Athena queries:

    SELECT DATE_PARSE(your_timestamp_string, '%Y-%m-%d %H:%i:%s.%f') 
    FROM your_table

    Keep in mind that the result of DATE_PARSE is still a millisecond-precision timestamp, so the parsed value loses the last three fractional digits. Only the original string column retains them.

  2. Separate Fields: Another approach is to store the date and time down to whole seconds in one field and the fractional seconds in a separate numeric field. This preserves the full precision of the fractional part, although it complicates date/time operations.
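As a rough illustration of both workarounds, here is a minimal Python sketch of how the values could be prepared before they are written to Parquet. It assumes a preprocessing step (for example inside a Glue job) where the timestamps are available as Python datetime objects. The function names are hypothetical and not part of any AWS API, and only the standard library is used.

```python
from datetime import datetime
from typing import Tuple

# Layouts taken from the question: six fractional digits vs. whole seconds.
MICROS_FORMAT = "%Y-%m-%d %H:%M:%S.%f"
SECONDS_FORMAT = "%Y-%m-%d %H:%M:%S"


def to_string_field(ts: datetime) -> str:
    """Workaround 1: fixed-width string that keeps all six fractional digits."""
    # %f always emits six zero-padded digits, so the strings sort
    # lexicographically in the same order as the timestamps themselves.
    return ts.strftime(MICROS_FORMAT)


def to_separate_fields(ts: datetime) -> Tuple[str, int]:
    """Workaround 2: whole-second part plus the microseconds as an integer."""
    return ts.strftime(SECONDS_FORMAT), ts.microsecond


def from_separate_fields(seconds_part: str, micros: int) -> datetime:
    """Rebuild the full-precision value from the two fields."""
    return datetime.strptime(seconds_part, SECONDS_FORMAT).replace(microsecond=micros)


ts = datetime(2023, 6, 1, 12, 34, 56, 123456)
print(to_string_field(ts))      # 2023-06-01 12:34:56.123456
print(to_separate_fields(ts))   # ('2023-06-01 12:34:56', 123456)
```

With the string variant, ordering and range filters still work on the raw column because the format is fixed-width. With the split variant, the whole-second field can be stored as a regular timestamp and the microseconds as an integer column next to it.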

If this is a crucial requirement for your use case, I would recommend reaching out to AWS support or your AWS account manager to discuss possible solutions or to submit a feature request.

If the answer is helpful, please click "Accept Answer" and upvote it.

EXPERT
answered 1 year ago
EXPERT
reviewed 2 months ago
0

Thanks a million for your fast and precise answer. We will go for the first approach since it is the easiest and safest method in my opinion. We will discuss a feature request with our account manager.

Best regards, Susa

Susa
answered 1 year ago
  • Hi Susa, I’m happy to help. Please don’t forget to click on “Accept Answer” as that helps other members of the community find this article easily on search, and recognizes my effort to help other people with their AWS questions.
