I have a partitioned table like this year=2024/month=02/day=25 , will using a where clause like Date(Year-month-hour ) > Date('2024-02-20') reduce data scanned by using partitions ?

0

I have a partitioned table in athena based on

I query the data with this where clause DATE(year || '-' || month || '-' || day) >= from_iso8601_timestamp('2024-03-18T00:00:00')

does it still make use of partitioning to reduce the data scanned

KG
已提问 1 个月前470 查看次数
2 回答
0

For partition pruning to work, you would have to use a where like: WHERE year='2024' and month='02' and day='25'

AWS
Don_D
已回答 1 个月前
profile picture
专家
已审核 1 个月前
0

Adding to the above answer, you can consider enabling partition projection and use date format for year, month and day partitions. https://docs.aws.amazon.com/athena/latest/ug/partition-projection-supported-types.html

To enable partition projection, you can define the partitioning column and table properties as shown below -

CREATE EXTERNAL TABLE tableName(
col1 string,
col2 int,
.....)
PARTITIONED BY (
   `timestamp` string)
LOCATION
	's3://bucket/prefix/'
TBLPROPERTIES (
  'projection.enabled'='true', 
  'projection.timestamp.format'='yyyy-MM-dd', 
  'projection.timestamp.interval'='1', 
  'projection.timestamp.interval.unit'='DAYS', 
  'projection.timestamp.range'='2023-01-01,NOW', 
  'projection.timestamp.type'='date', 
  'storage.location.template'='s3://bucket/prefix/${timestamp}')

With this you can use following in where clause timestamp >= '2024-03-18'. In this case you can pass date as string and need not use 'from_iso8601_timestamp' function.

AWS
已回答 1 个月前

您未登录。 登录 发布回答。

一个好的回答可以清楚地解答问题和提供建设性反馈,并能促进提问者的职业发展。

回答问题的准则