I have an Iceberg table defined like this:
CREATE TABLE IF NOT EXISTS staging (
    id STRING,
    staging_timestamp BIGINT,
    ... blah blah blah ...
)
PARTITIONED BY (bucket(24, id))
LOCATION 's3://%s/%s/staging/'
TBLPROPERTIES (
    'table_type' = 'ICEBERG',
    'optimize_rewrite_data_file_threshold' = '1',
    'vacuum_max_snapshot_age_seconds' = '3600'
);
I expected the number of files in S3 to stay around 24 (one per bucket), especially after running OPTIMIZE and VACUUM.
However, after a few days I found 100,000 files in S3. VACUUM would time out, and OPTIMIZE didn't seem to remove any files.
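For reference, the maintenance statements have roughly this form in Athena, and the Iceberg metadata tables can show how many data files the current snapshot actually references, as opposed to everything sitting under the S3 prefix. This is only a sketch, assuming the table is queried from Athena under the name `staging`; the two metadata queries are an illustration of how the counts could be checked, not output from the runs described above.

```sql
-- Compact small data files (Athena Iceberg maintenance statement).
OPTIMIZE staging REWRITE DATA USING BIN_PACK;

-- Expire old snapshots and remove files that are no longer referenced.
VACUUM staging;

-- Files referenced by the current snapshot of the table.
SELECT count(*) AS live_files
FROM "staging$files";

-- Snapshots still retained; each one can keep older files alive.
SELECT count(*) AS retained_snapshots
FROM "staging$snapshots";
```

If the first count is close to 24 while S3 holds far more objects, the extra objects would most likely be metadata, manifests, or data files still held by older snapshots rather than live data.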
What am I doing wrong?