Help us improve the AWS re:Post Knowledge Center by sharing your feedback in a brief survey. Your input can influence how we create and update our content to better support your AWS journey.
How do I troubleshoot issues with VACUUM in Amazon Redshift?
My VACUUM queries fail in my Amazon Redshift cluster.
Short description
VACUUM is a resource-intensive operation that might slow down because the following occurs:
- A high percentage of unsorted data
- A large table with too many columns
- The use of an interleaved sort key
- Irregular or infrequent use of VACUUM
- Concurrent tables, cluster queries, (Data Definition Language (DDL) statements, or extract, transform, load (ETL) jobs
Use the svv_vacuum_progress query to check the status and details of your VACUUM operation.
Resolution
Troubleshoot VACUUM performance
Note: The following applies to provisioned Amazon Redshift clusters. The following system tables and queries don't work on Amazon Redshift Serverless.
To check if the VACUUM operation is in progress, run the following SVV_VACUUM_PROGRESS query:
dev=# SELECT * FROM svv_vacuum_progress; table_name | status | time_remaining_estimate -----------+---------------------------------+------------------------- data8 | Vacuum: initialize merge data8 | 4m 55s (1 row)
The SVV_VACUUM_PROGRESS query also includes the name of the table, the VACUUM's status, and the estimated time that remains until it completes. If the VACUUM isn't running, then the SVV_VACUUM_PROGRESS query shows the status of the last run VACUUM. The SVV_VACUUM_PROGRESS query returns only one row of results.
To check the details of the table that's being vacuumed, run the following query:
SELECT schema, table_id, "table", diststyle, sortkey1, sortkey_num, unsorted, tbl_rows, estimated_visible_rows, stats_off FROM svv_table_info WHERE "table" IN ('data8');
Note: Replace table with your table name and data8 with your schema name.
Example output:
Schema | table_id | table | diststyle | sortkey1 | sortkey_num | unsorted | tbl_rows | est_visible_rows | stats_off ------------+----------+-------+-----------+----------+-------------+----------+-----------+------------------+----------- testschema | 977719 | data8 | EVEN | order_id | 2 | 25.00 | 755171520 | 566378624 | 100.00
From the preceding output, the sortkey1 column shows the main sort key.
If the sortkey1 column shows INTERLEAVED, then the table has an interleaved sort key.
The sortkey_num column shows the number of columns in the sort key.
The unsorted column shows the percentage of rows that need to be sorted.
Thetbl_rowscolumn shows the total number of rows, including the deleted and updated rows.
The estimated_visible_rows is the number of rows that excludes the deleted rows.
After a complete vacuum (delete and sort), the value for tbl_rows and estimated_visible_rows resemble each other, and the unsorted value reaches 0.
Note: Data in the table updates in real time. To check the progress of VACUUM, continue to run the query. The unsorted rows gradually decrease as VACUUM progresses. To confirm whether you have a high percentage of unsorted data, check the VACUUM information for a specific table.
Run the following query to check VACUUM information for a table.
SELECT table_id, status, rows, sortedrows, blocks, eventtime FROM stl_vacuum WHERE table_id=977719 ORDER BY eventtime DESC LIMIT 20;
Note: Replace 97771 with the table ID from the previous query.
Example output:
table_id | status | rows | sortedrows | blocks | eventtime ----------+--------------------------------+------------+------------+--------+---------------------------- 977719 | [VacuumBG] Finished | 566378640 | 0 | 23618 | 2020-05-27 06:55:33.232536 977719 | [VacuumBG] Started Delete Only | 1132757280 | 566378640 | 47164 | 2020-05-27 06:55:18.906008 977719 | Finished | 566378640 | 566378640 | 23654 | 2020-05-27 06:46:04.086842 977719 | Started | 1132757280 | 566378640 | 45642 | 2020-05-27 06:28:17.128345 (4 rows)
In the preceding example, the output lists the latest events first, and then older events, in sorted order:
The last VACUUM was an automatic VACUUM DELETE that started at 2020-05-27 06:55:18.906008 UTC and completed in a few seconds.
This VACUUM released the space that deleted rows occupied. You can compare the changes in the number of blocks the table occupied from the start and completion of VACUUM.
Note: Amazon Redshift automatically performs VACUUM SORT and VACUUM DELETE operations on tables in the background. These background VACUUMs run during periods of reduced loads and are paused during periods of high load. This automatic VACUUM reduces the need to run the VACUUM command.
The sortedrows column shows the number of sorted rows in the table. In the last VACUUM, no sort was done because it was an automatic VACUUM DELETE operation. Because the active rows weren't sorted, the row marked for deletion shows the same number of sorted rows from when VACUUM started. After VACUUM DELETE completes, you see 0 sorted rows.
The initial VACUUM that started at 2020-05-27 06:28:17.128345 UTC shows a full VACUUM. The process released the space from deleted rows and sorted rows after about 18 minutes. When the VACUUM operation completed, the output shows the same values for the rows and sortedrows columns because the VACUUM successfully sorted the rows.
For a VACUUM that's already in progress, continue to monitor its performance and incorporate best practices.
Troubleshoot VACUUM failure
Note: If you receive errors when you run AWS Command Line Interface (AWS CLI) commands, then see Troubleshooting errors for the AWS CLI. Also, make sure that you're using the most recent AWS CLI version.
To find out why a VACUUM query failed, use either SYS_QUERY_HISTORY or STL_QUERY to check for error messages. If you use the STL_QUERY, then you must get the error details from STL_ERROR. Because STL_ERROR doesn't have a query ID column, find the PID field from the STL_QUERY. Then, use that field in the STL_ERROR query.
Example SYS_QUERY_HISTORY:
SELECT user_id, query_id, transaction_id, session_id, status, start_time, end_time, execution_time, error_message FROM sys_query_history WHERE query_id IN (<failed queries>) +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ | user_id | query_id | transaction_id | session_id | status | start_time | end_time | execution_time | error_message | | 100 | 915082632 | 35599398 | 1096641177 | failed | 2024-10-06 21:09:30.209587 | 2024-10
If you use the execute statement to run the VACUUM query, then use describe-statement AWS CLI command to identify error messages.
Example describe-statement:
aws redshift-data describe-statement --id 7c823348d-be8b-437a-9a0-db8c0ca44f0f { "ClusterIdentifier": "redshift-cluster-1", "CreatedAt": "2024-10-07T16:25:27.566000+00:00", "Duration": -1, "Error": "ERROR: VACUUM cannot run inside a multiple commands statement", "HasResultSet": false, "Id": "7c823348d-be8b-437a-9a0-db8c0ca44f0f", "QueryString": "vacuum full toptem;\nvacuum full tsupport;\nvacuum full supplierxbox;\nvacuum full party;", "RedshiftPid": 10723479554, "RedshiftQueryId": 42304, "ResultRows": -1, "ResultSize": -1, "Status": "FAILED", "UpdatedAt": "2024-10-07T16:25:33.566000+00:00" }
If the cluster is fully idle, then run a MANUAL VACUUM for failed VACUUM attempts. For more information, see Vacuuming and analyzing tables manually.
Use VACUUM best practices
You can improve VACUUM performance with the following best practices.
Because VACUUM is a resource-intensive operation, run it during off-peak hours.
Use wlm_query_slot_count to temporarily override the concurrency level in a queue for a VACUUM operation.
Run the VACUUM operation with a threshold parameter of up to 99% for large tables. Determine the appropriate threshold and frequency of running VACUUM. For example, you might want to run VACUUM at a threshold of 100%, or have your data always sorted. Use the approach that optimizes your Amazon Redshift cluster's query performance.
Run a VACUUM FULL or VACUUM SORT ONLY often enough that a high unsorted AWS Region doesn't accumulate in large tables.
If there is a large amount of unsorted data on a large table, then perform a deep copy.
Run the VACUUM command with the BOOST option.
Divide large tables into time-series tables to improve VACUUM performance. In some cases, when you use a time-series table, you can fulfill the need to run VACUUM.
Choose a column compression type for large tables. Compressed rows consume less disk space when you sort data.
Use the ANALYZE command after VACUUM operation to update the statistics. The query planner uses these values to choose the best plans.
Related information
How do I troubleshoot a failed or canceled query in Amazon Redshift?
Why is my Amazon Redshift Serverless query canceled or stopped?
- Topics
- Analytics
- Tags
- Amazon Redshift
- Language
- English

This article was reviewed and updated on 2026-04-06.
Relevant content
- asked 2 years ago
- asked 2 years ago
- Accepted Answerasked 6 years ago
AWS OFFICIALUpdated 10 months ago