I have a partitioned table like this year=2024/month=02/day=25 , will using a where clause like Date(Year-month-hour ) > Date('2024-02-20') reduce data scanned by using partitions ?

0

I have a partitioned table in athena based on

I query the data with this where clause DATE(year || '-' || month || '-' || day) >= from_iso8601_timestamp('2024-03-18T00:00:00')

does it still make use of partitioning to reduce the data scanned

KG
질문됨 한 달 전470회 조회
2개 답변
0

For partition pruning to work, you would have to use a where like: WHERE year='2024' and month='02' and day='25'

AWS
Don_D
답변함 한 달 전
profile picture
전문가
검토됨 한 달 전
0

Adding to the above answer, you can consider enabling partition projection and use date format for year, month and day partitions. https://docs.aws.amazon.com/athena/latest/ug/partition-projection-supported-types.html

To enable partition projection, you can define the partitioning column and table properties as shown below -

CREATE EXTERNAL TABLE tableName(
col1 string,
col2 int,
.....)
PARTITIONED BY (
   `timestamp` string)
LOCATION
	's3://bucket/prefix/'
TBLPROPERTIES (
  'projection.enabled'='true', 
  'projection.timestamp.format'='yyyy-MM-dd', 
  'projection.timestamp.interval'='1', 
  'projection.timestamp.interval.unit'='DAYS', 
  'projection.timestamp.range'='2023-01-01,NOW', 
  'projection.timestamp.type'='date', 
  'storage.location.template'='s3://bucket/prefix/${timestamp}')

With this you can use following in where clause timestamp >= '2024-03-18'. In this case you can pass date as string and need not use 'from_iso8601_timestamp' function.

AWS
답변함 한 달 전

로그인하지 않았습니다. 로그인해야 답변을 게시할 수 있습니다.

좋은 답변은 질문에 명확하게 답하고 건설적인 피드백을 제공하며 질문자의 전문적인 성장을 장려합니다.

질문 답변하기에 대한 가이드라인

관련 콘텐츠