Athena/Glue Timestamp Format in Microseconds


Hello,

I would like to change the timestamp representation in Athena and Glue to microseconds. We crawl the data with Glue, store it in Glue tables, and query it with Athena. By default, timestamps are expressed in milliseconds, so the fractional seconds in our Parquet files are truncated. However, we need the full precision. Here is an example:

Athena/Glue: YYYY-MM-DD hh:mm:ss.fff
Real Data: YYYY-MM-DD hh:mm:ss.ffffff

Is there a way to keep the full fractional seconds? Trying to recover the lost digits with a function in Athena just pads the fraction with zeros. :(

Thanks in advance.

Best regards, Susa

Susa
Asked a year ago, 2221 views
2 Answers
Accepted Answer

As far as I know, neither AWS Glue nor Amazon Athena natively supports timestamp precision beyond milliseconds; that is, microseconds and nanoseconds are not supported.

Timestamps in AWS Glue and Athena are represented in the YYYY-MM-DD HH:MI:SS.fff format (millisecond precision), as you've noted. This is a limitation of the current timestamp data type in these services. You can read more about the supported data types in the official documentation: Data types in Amazon Athena

There are a couple of possible workarounds, but they are not perfect and may require additional data manipulation:

  1. String Field: Store your timestamps as a string field with your desired precision in your Parquet files. You can then convert or manipulate these strings in your Athena queries as needed.

    You could use the string format YYYY-MM-DD hh:mm:ss.ffffff and parse it into a timestamp when necessary in your Athena queries:

    SELECT DATE_PARSE(your_timestamp_string, '%Y-%m-%d %H:%i:%s.%f') 
    FROM your_table

    Keep in mind that the parsed result is still truncated to milliseconds; only the original string column keeps the full microsecond value.

  2. Separate Fields: Another approach is to separate the date and time components and the fractional seconds into separate fields. This would allow you to maintain precision for your fractional seconds in a separate numeric field, although it would complicate date/time operations.
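Both workarounds have to be applied on the writer side, before the Parquet files are produced. As a rough illustration only, here is a minimal Python sketch of the two conversions using the standard library. The helper names (`to_micro_string`, `split_fields`, `join_fields`) are hypothetical and not part of any AWS API, and the sketch assumes your source values are available as Python `datetime` objects.

```python
from datetime import datetime
from typing import Tuple

# Hypothetical helpers for illustration; nothing here is an AWS or Glue API.
SECONDS_FORMAT = "%Y-%m-%d %H:%M:%S"
MICRO_FORMAT = SECONDS_FORMAT + ".%f"  # Python's %f writes six zero-padded digits


def to_micro_string(ts: datetime) -> str:
    """Workaround 1: render the timestamp as a string with all six fractional digits."""
    return ts.strftime(MICRO_FORMAT)


def split_fields(ts: datetime) -> Tuple[str, int]:
    """Workaround 2: second-resolution text plus the microseconds as an integer."""
    return ts.strftime(SECONDS_FORMAT), ts.microsecond


def join_fields(seconds_part: str, micros: int) -> datetime:
    """Rebuild the full-precision value from the two separate fields."""
    base = datetime.strptime(seconds_part, SECONDS_FORMAT)
    return base.replace(microsecond=micros)


sample = datetime(2023, 6, 1, 12, 34, 56, 123456)
print(to_micro_string(sample))  # 2023-06-01 12:34:56.123456
print(split_fields(sample))     # ('2023-06-01 12:34:56', 123456)
```

Because the string from `to_micro_string` is fixed-width and zero-padded, plain string comparison and sorting on that column follow chronological order, which helps if you keep it as a string in your queries.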

If this is a crucial requirement for your use case, I would recommend reaching out to AWS support or your AWS account manager to discuss possible solutions or to submit a feature request.

If the answer is helpful, please click "Accept Answer" and upvote it.

Expert
Answered a year ago
Expert
Reviewed 2 months ago

Thanks a million for your fast and precise answer. We will go for the first approach since it is the easiest and safest method in my opinion. We will discuss a feature request with our account manager.

Best regards, Susa

Susa
Answered a year ago
  • Hi Susa, I’m happy to help. Please don’t forget to click on “Accept Answer” as that helps other members of the community find this article easily on search, and recognizes my effort to help other people with their AWS questions.
