Skip to content

Why isn't the Amazon Redshift VACUUM operation reclaiming disk space?

3 minute read
0

A VACUUM operation completes successfully on an Amazon Redshift table. However, disk usage remains high.

Short description

When you delete table rows, a hidden metadata identity column (DELETE_XID) is marked with the transaction ID that deleted the row.

If there is an active, long-running transaction that began before the deletion, then VACUUM can't clean up the rows, and disk space can't be reclaimed.

For more information about the DELETE_XID column, see Optimizing storage for narrow tables.

Resolution

To check for long-running transactions on the Amazon Redshift cluster, run the following query:

rsdb=# select *,datediff(s,txn_start,getdate())/86400||' days '||datediff(s,txn_start,getdate())%86400/3600||' hrs '||datediff(s,txn_start,getdate())%3600/60||' mins '||datediff(s,txn_start,getdate())%60||' secs' duration from svv_transactions where lockable_object_type='transactionid' and  pid<>pg_backend_pid() order by 3;

The following output shows that xid 50341 has been active for 19 minutes and 37 seconds:

txn_owner  | txn_db |  xid  |  pid  |         txn_start          |   lock_mode   | lockable_object_type | relation | granted |           duration
-----------+--------+-------+-------+----------------------------+---------------+----------------------+----------+---------+------------------------------
 superuser | rsdb   | 50341 | 21612 | 2019-08-19 20:20:33.147622 | ExclusiveLock | transactionid        |          | t       | 0 days 0 hrs 19 mins 37 secs
(1 row)

To confirm that you deleted rows from the Amazon Redshift table, run the following query:

select a.query, a.xid, trim(c.name) tablename, b.deleted_rows, a.starttime, a.endtime
from stl_query a 
join (select query, tbl, sum(rows) deleted_rows from stl_delete group by 1,2) b 
on a.query = b.query
join (select id, name from stv_tbl_perm group by 1,2) c 
on c.id = b.tbl 
where a.xid in (select distinct xid from stl_commit_stats)
and trim(c.name) = 'tablename'
order by a.starttime;

The following output shows that the transaction marked for row deletion (xid 50350) started after the long-running transaction (xid 50341):

query  |  xid  | tablename | deleted_rows |         starttime          |          endtime
-------+-------+-----------+--------------+----------------------------+----------------------------
 18026 | 50350 | test      |            5 | 2019-08-19 20:20:48.137594 | 2019-08-19 20:20:50.125609
(1 rows)

To allow VACUUM DELETE to reclaim these deleted rows, choose one of the following options:

  • Wait for the long-running transaction to complete.
  • Use the PG_TERMINATE_BACKEND statement to terminate the session that holds the long-running transaction.

After you complete the preceding action, rerun the VACUUM operation.

Investigate the long-running transactions

To check the activity in a long-running transaction, query the SVL_STATEMENTTEXT view:

rsdb=# select pid, xid, trim(label), starttime, endtime, trim(text) from svl_statementtext where xid = 50341 order by starttime , sequence;

Example output:

pid  |  xid  |  btrim  |         starttime          |          endtime           |          btrim
-------+-------+---------+----------------------------+----------------------------+--------------------------
 21612 | 50341 | default | 2019-08-19 20:20:31.733843 | 2019-08-19 20:20:31.733844 | begin;
 21612 | 50341 | default | 2019-08-19 20:20:33.146937 | 2019-08-19 20:20:35.020556 | select * from sometable;
(2 rows)

To check that queries run in the transaction, query the STV_INFLIGHT view:

rsdb=# select query, xid, pid, starttime, trim(text) from stv_inflight where xid = 50341;

Example output:

query | xid | pid | starttime | btrim
-------+-----+-----+-----------+-------
(0 rows)

Long-running transaction common causes

The following behavior can result in long-running transactions:

  • A user starts an implicit transaction from a client where automatic commit is turned off. The transaction remains active until the user closes the transaction with the COMMIT or ROLLBACK command, or until the session is terminated.
  • A user starts a transaction and uses BEGIN, but never closes the transaction with the COMMIT or ROLLBACK command.

Related information

Vacuuming tables

Minimizing vacuum times

AWS OFFICIALUpdated 7 months ago