Athena Iceberg creates 100,000 files where just a few dozen were expected


I have an Iceberg table in Athena defined like this:

	CREATE TABLE IF NOT EXISTS staging (
	  id STRING,
	  staging_timestamp BIGINT,
	  ... blah blah blah ...
	)
	PARTITIONED BY (bucket(24, id))
	LOCATION 's3://%s/%s/staging/'
	TBLPROPERTIES ( 
	  'table_type' ='ICEBERG', 
	  'optimize_rewrite_data_file_threshold' = '1',
	  'vacuum_max_snapshot_age_seconds' = '3600'
	);

I expected the number of files in S3 to stay around 24 (one per bucket), especially after running OPTIMIZE and VACUUM. However, after a few days I found 100,000 files under the table location in S3. VACUUM timed out, and OPTIMIZE did not seem to remove any files.
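
For context, here is a minimal sketch of the maintenance statements referred to above, plus two metadata queries that could help narrow down where the files come from. It assumes Athena's documented Iceberg syntax and that the table is addressed as `staging` in the current database. The exact statements run against this table are not shown in the post, so treat these as illustrative. Athena runs one statement per query, so each would be submitted separately.

	-- Compact small data files into larger ones. This writes new files and a
	-- new snapshot; it does not delete the old files by itself.
	OPTIMIZE staging REWRITE DATA USING BIN_PACK;

	-- Expire snapshots older than vacuum_max_snapshot_age_seconds and
	-- remove files that are no longer referenced.
	VACUUM staging;

	-- Count the data files referenced by the current snapshot only.
	SELECT count(*) AS live_data_files FROM "staging$files";

	-- Count the snapshots that are still retained.
	SELECT count(*) AS retained_snapshots FROM "staging$snapshots";

The raw S3 object count also includes metadata files and data files that belong only to older snapshots, so it can be far higher than `live_data_files`. Comparing the two numbers would show whether the 100,000 objects are live data or leftovers awaiting snapshot expiration.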

What am I doing wrong?

AlexR
Asked 2 months ago, viewed 171 times
No answers
