Athena Iceberg creates 100,000 files where just a few dozen were expected


I have an Iceberg table defined like this:

	CREATE TABLE IF NOT EXISTS staging (
	  id STRING,
	  staging_timestamp BIGINT,
	  ... blah blah blah ...
	)
	PARTITIONED BY (bucket(24, id))
	LOCATION 's3://%s/%s/staging/'
	TBLPROPERTIES (
	  'table_type' = 'ICEBERG',
	  'optimize_rewrite_data_file_threshold' = '1',
	  'vacuum_max_snapshot_age_seconds' = '3600'
	);

I expected the number of files in S3 to stay around 24, especially after running OPTIMIZE and VACUUM. However, after a few days I found 100,000 files in S3. VACUUM would time out, and OPTIMIZE didn't seem to remove any files.
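
A minimal sketch of the statements involved, assuming Athena's documented Iceberg syntax and that the table lives in the current database; the statements actually run are not shown in the question. The `$files` metadata query counts only the data files referenced by the current snapshot, so it can be compared against the raw S3 object count, which also includes metadata files and data files still held by older snapshots.

```sql
-- Count the data files referenced by the table's current snapshot.
SELECT count(*) AS data_files
FROM "staging$files";

-- Compact small data files into larger ones (the rewritten originals stay
-- in S3 until the snapshots that reference them are expired).
OPTIMIZE staging REWRITE DATA USING BIN_PACK;

-- Expire old snapshots and remove files that are no longer referenced.
VACUUM staging;
```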

What am I doing wrong?

AlexR
asked 2 months ago · 170 views
No answers
