I have a partitioned table like this year=2024/month=02/day=25 , will using a where clause like Date(Year-month-hour ) > Date('2024-02-20') reduce data scanned by using partitions ?

0

I have a partitioned table in athena based on

I query the data with this where clause DATE(year || '-' || month || '-' || day) >= from_iso8601_timestamp('2024-03-18T00:00:00')

does it still make use of partitioning to reduce the data scanned

KG
gefragt vor einem Monat470 Aufrufe
2 Antworten
0

For partition pruning to work, you would have to use a where like: WHERE year='2024' and month='02' and day='25'

AWS
Don_D
beantwortet vor einem Monat
profile picture
EXPERTE
überprüft vor einem Monat
0

Adding to the above answer, you can consider enabling partition projection and use date format for year, month and day partitions. https://docs.aws.amazon.com/athena/latest/ug/partition-projection-supported-types.html

To enable partition projection, you can define the partitioning column and table properties as shown below -

CREATE EXTERNAL TABLE tableName(
col1 string,
col2 int,
.....)
PARTITIONED BY (
   `timestamp` string)
LOCATION
	's3://bucket/prefix/'
TBLPROPERTIES (
  'projection.enabled'='true', 
  'projection.timestamp.format'='yyyy-MM-dd', 
  'projection.timestamp.interval'='1', 
  'projection.timestamp.interval.unit'='DAYS', 
  'projection.timestamp.range'='2023-01-01,NOW', 
  'projection.timestamp.type'='date', 
  'storage.location.template'='s3://bucket/prefix/${timestamp}')

With this you can use following in where clause timestamp >= '2024-03-18'. In this case you can pass date as string and need not use 'from_iso8601_timestamp' function.

AWS
beantwortet vor einem Monat

Du bist nicht angemeldet. Anmelden um eine Antwort zu veröffentlichen.

Eine gute Antwort beantwortet die Frage klar, gibt konstruktives Feedback und fördert die berufliche Weiterentwicklung des Fragenstellers.

Richtlinien für die Beantwortung von Fragen