Athena Iceberg creates 100,000 files where just a few dozen were expected


I have an Iceberg table defined like this:

	CREATE TABLE IF NOT EXISTS staging (
	  id STRING,
	  staging_timestamp BIGINT,
	  ... blah blah blah ...
	)
	PARTITIONED BY (bucket(24, id))
	LOCATION 's3://%s/%s/staging/'
	TBLPROPERTIES (
	  'table_type' = 'ICEBERG',
	  'optimize_rewrite_data_file_threshold' = '1',
	  'vacuum_max_snapshot_age_seconds' = '3600'
	);

I expected the number of files in S3 to stay around 24 (one per bucket), especially after running OPTIMIZE and VACUUM. However, after a few days I found 100,000 files in S3. VACUUM times out, and OPTIMIZE doesn't seem to remove any files.
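
For reference, the maintenance statements in question are roughly of the following form. This is only a sketch: the table name is taken from the DDL above, and the final query against the `$files` metadata table is an assumed way to count the data files the table currently references, as opposed to every object sitting under the S3 prefix.

```sql
-- Compact small data files (BIN_PACK is the rewrite strategy Athena documents).
OPTIMIZE staging REWRITE DATA USING BIN_PACK;

-- Expire snapshots older than vacuum_max_snapshot_age_seconds
-- and remove files that are no longer referenced.
VACUUM staging;

-- Assumed diagnostic: count the data files referenced by the current snapshot.
SELECT COUNT(*) AS data_file_count
FROM "staging$files";
```

Comparing `data_file_count` with the object count in S3 may help show whether the 100,000 objects are live data files or files left behind by old snapshots.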

What am I doing wrong?

AlexR
asked 2 months ago, 170 views
No answers
