Athena Iceberg creates 100,000 files where just a few dozen were expected


I have an Iceberg table defined like this:

	CREATE TABLE IF NOT EXISTS staging (
	  id STRING,
	  staging_timestamp BIGINT,
	  ... blah blah blah ...
	)
	PARTITIONED BY (bucket(24, id))
	LOCATION 's3://%s/%s/staging/'
	TBLPROPERTIES (
	  'table_type' = 'ICEBERG',
	  'optimize_rewrite_data_file_threshold' = '1',
	  'vacuum_max_snapshot_age_seconds' = '3600'
	);

I expected the number of files in S3 to stay around 24 (roughly one per bucket), especially after running OPTIMIZE and VACUUM. However, after a few days I found 100,000 files in S3. VACUUM would time out, and OPTIMIZE didn't seem to remove any files.
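
For reference, the maintenance statements mentioned above would look roughly like this in Athena. This is only a sketch: it assumes the table is named `staging` as in the DDL and lives in the current database. The `$files` metadata query is not from the original post; it is included as one possible way to compare the number of data files Iceberg currently tracks with the raw object count in S3, which also includes metadata files and files belonging to older snapshots.

```sql
-- Compact small data files (Athena Iceberg bin-pack rewrite).
OPTIMIZE staging REWRITE DATA USING BIN_PACK;

-- Expire old snapshots and remove files that are no longer referenced.
VACUUM staging;

-- Assumption: count the data files referenced by the current snapshot,
-- using the Iceberg "$files" metadata table.
SELECT count(*) AS live_data_files
FROM "staging$files";
```

If `live_data_files` is small while the S3 listing shows around 100,000 objects, most of the objects are probably not data files in the current snapshot.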

What am I doing wrong?

AlexR
Asked 2 months ago · 170 views
No answers
