1 Answer
Can you:
- Verify that the VACUUM command is executing successfully after each MERGE operation?
- Verify that it's actually deleting old snapshots and associated data files?
- Look at the size and nature of the updates you're making every hour? If each update is adding a significant amount of data, this could contribute to the rapid growth of the table.
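The first two checks can be run directly against Trino's Iceberg metadata tables. A minimal sketch, assuming the table name from the question (`my_table`) and a recent Trino version:

```sql
-- List all snapshots the table still retains. After a successful
-- VACUUM / snapshot expiry, only recent snapshots should remain here.
SELECT committed_at, snapshot_id, parent_id, operation
FROM "my_table$snapshots"
ORDER BY committed_at DESC;

-- The $history table shows which retained snapshots are ancestors
-- of the current table state.
SELECT made_current_at, snapshot_id, is_current_ancestor
FROM "my_table$history"
ORDER BY made_current_at DESC;
```

If `$snapshots` returns only one row, snapshot expiry is working and the growth must come from files the current snapshot still references.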
Thank you very much for your suggestion.
- Yep. It is successful every time.
SELECT * FROM "my_table$snapshots";
returns a single row with the following content:
- committed_at: 2024-03-29 13:01:39.498 UTC
- snapshot_id: 8024726289693964937
- parent_id: 4447953933496454465
- operation: overwrite
- manifest_list: s3://my_backet/my_table/metadata/snap-8024726289693964937-1-977f4853-ae88-429a-9189-10fd07a538c3.avro
- summary: {added-position-deletes=4, total-equality-deletes=0, trino_query_id=20240329_130135_00187_c22yh, added-position-delete-files=1, added-delete-files=1, total-records=1369786, changed-partition-count=1, total-position-deletes=2023891, added-files-size=1627, total-delete-files=2210, total-files-size=6475502851, total-data-files=308}
SELECT * FROM "my_table$files";
returns 308 rows. Unfortunately, I don't know how to check whether any of these files is associated with an old snapshot.
- It is not the case. To put it simply, each MERGE changes the value of a single integer column, keeping the total number of rows at the same level.
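One way to see what the retained files consist of is to group the `$files` metadata table by content type. A sketch, assuming the column names exposed by recent Trino versions (`content` is 0 for data files, 1 for position-delete files, 2 for equality-delete files):

```sql
-- Break down the current snapshot's files by content type to see
-- whether data files or delete files dominate the table size.
SELECT content,
       count(*)                AS file_count,
       sum(record_count)       AS records,
       sum(file_size_in_bytes) AS total_bytes
FROM "my_table$files"
GROUP BY content
ORDER BY content;
```

The snapshot summary above already hints at this: total-position-deletes (2,023,891) exceeds total-records (1,369,786), and total-delete-files is 2,210, so accumulated delete files from the hourly MERGEs may account for much of the growth even though every file belongs to the current snapshot.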
Could you verify whether versioning is enabled on your bucket? If so, retained object versions might be contributing to the increase in size. See the S3 documentation on versioning workflows for details.
Bucket Versioning is disabled.