Redshift disk usage suddenly increased after adding the reserve nodes


We had four nodes (640 GB in total) and discovered that all of our disk space had been consumed. We also had two reserve nodes (160 GB each), so we resized the cluster using "Classic resize" to take advantage of them. The cluster now has 960 GB of disk space. Previously we used at most 640 GB, and our data is nearly the same size. However, since adding the reserve nodes, disk usage remains at 95% and sometimes reaches 100%.

What could be the reason? Is there any issue with classic resize?

asked 6 months ago, 262 views
1 Answer

With a classic resize you can change the node type, the number of nodes, or both, in a similar manner to elastic resize. There are some things to look out for after you do a classic resize.

Sorting and distribution operations that result from classic resize to RA3

During classic resize to RA3, tables with KEY distribution that are migrated as EVEN distribution are converted back to their original distribution style.

After the cluster is fully resized, the following sort behavior occurs:

  • If the resize results in the cluster having more slices, KEY distribution tables become partially unsorted, but EVEN tables remain sorted. Additionally, the information about how much data is sorted may not be up to date directly after the resize. After key recovery, automatic vacuum sorts the table over time.

  • If the resize results in the cluster having fewer slices, both KEY distribution and EVEN distribution tables become partially unsorted. Automatic vacuum sorts the table over time.
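To see how much of each table is currently unsorted after the resize, one option is to read the `unsorted` column of the `SVV_TABLE_INFO` system view. The sketch below is only an illustration: the helper name `tables_needing_sort`, the 20% threshold, and the sample rows are assumptions, and the rows would normally come from running the query with whatever SQL client you use.

```python
# Query against SVV_TABLE_INFO, a Redshift system view. "unsorted" is the
# percentage of unsorted rows; "size" is the table size in 1 MB blocks.
UNSORTED_SQL = """
SELECT "schema", "table", diststyle, unsorted, size
FROM svv_table_info
ORDER BY unsorted DESC NULLS LAST;
"""


def tables_needing_sort(rows, threshold_pct=20.0):
    """Return (schema, table) pairs whose unsorted percentage exceeds the threshold.

    rows: iterable of (schema, table, diststyle, unsorted, size) tuples,
    for example the result of running UNSORTED_SQL. "unsorted" may be None
    for tables without a sort key. The threshold is an arbitrary assumption.
    """
    return [
        (schema, table)
        for schema, table, _diststyle, unsorted, _size in rows
        if unsorted is not None and float(unsorted) > threshold_pct
    ]


# Hypothetical sample rows, not real cluster output.
sample_rows = [
    ("public", "orders", "KEY(order_id)", 42.5, 1200),
    ("public", "events", "EVEN", 0.0, 800),
    ("public", "lookup", "ALL", None, 10),
]
print(tables_needing_sort(sample_rows))  # [('public', 'orders')]
```

Because the sorted-data statistics may lag right after a resize, the numbers are best treated as a rough guide until automatic vacuum has caught up.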

Things that can cause disk storage to remain full after a classic resize in Redshift:

  • Long-running transactions that were already active before rows were deleted. These prevent the VACUUM operation from cleaning up the deleted rows until the transactions are committed or aborted.
  • Query processing spilling to disk, especially for queries with sorting, aggregations, or joins. Optimizing queries can reduce disk usage.
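  • Tables with VARCHAR(MAX) columns, or columns whose compression encoding is poorly suited to the data. Oversized VARCHAR columns can inflate intermediate results during query processing, and uncompressed or poorly encoded columns take more space on disk.
  • Pending maintenance operations such as sorting tables or rebalancing the cluster. Disk space can be freed once these complete.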
  • Cartesian joins producing very large intermediate result sets. Rewriting queries to avoid cross joins may help.
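Two of the causes above can be checked from system views: open transactions appear in `SVV_TRANSACTIONS`, and query steps that spilled to disk are flagged with `is_diskbased = 't'` in `SVL_QUERY_SUMMARY`. The following is a minimal sketch under assumptions: the helper `long_running`, the one-hour cutoff, and the sample rows are made up for illustration.

```python
from datetime import datetime, timedelta

# Open transactions; old ones can stop VACUUM from reclaiming deleted rows.
OPEN_TXN_SQL = """
SELECT xid, pid, txn_owner, txn_start
FROM svv_transactions
ORDER BY txn_start;
"""

# Query steps that spilled to disk, largest working memory first.
DISK_SPILL_SQL = """
SELECT query, step, label, rows, workmem
FROM svl_query_summary
WHERE is_diskbased = 't'
ORDER BY workmem DESC
LIMIT 20;
"""


def long_running(txns, now, max_age=timedelta(hours=1)):
    """Return the sorted, de-duplicated xids of transactions older than max_age.

    txns: iterable of (xid, pid, txn_owner, txn_start) tuples, for example
    the result of OPEN_TXN_SQL. SVV_TRANSACTIONS can return several rows per
    transaction (one per lock), hence the de-duplication.
    """
    return sorted({xid for xid, _pid, _owner, start in txns if now - start > max_age})


# Hypothetical sample rows, not real cluster output.
now = datetime(2024, 1, 1, 12, 0)
sample_txns = [
    (101, 5001, "etl_user", datetime(2024, 1, 1, 8, 0)),
    (101, 5001, "etl_user", datetime(2024, 1, 1, 8, 0)),
    (102, 5002, "bi_user", datetime(2024, 1, 1, 11, 45)),
]
print(long_running(sample_txns, now))  # [101]
```

If an old transaction shows up, committing or aborting it is what allows a later VACUUM to reclaim the deleted rows.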

Here are some things you can do:

  1. Run VACUUM manually after the resize to reclaim space from deleted rows and to re-sort tables. For more information, see the Amazon Redshift documentation on vacuuming tables.
  2. Review each table's distribution style, distribution key, and sort key selection. Tables with distribution skew (where more data is located on one node than on the others) can cause a single node's disk to fill up. If you have tables with skewed distribution, change the distribution style to one that spreads rows more uniformly. Changing the distribution style to ALL can also remove skew for smaller tables, but note that ALL stores a full copy of the table on every node, so it only makes sense when the table is small.
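As a rough illustration of the two steps above, the sketch below finds skewed tables from the `skew_rows` column of `SVV_TABLE_INFO` (the ratio of rows in the fullest slice to rows in the emptiest slice) and builds a VACUUM statement for a given table. The helper names, the skew ratio of 4.0, and the sample rows are assumptions, not values recommended by AWS.

```python
import re

# Table-level skew from SVV_TABLE_INFO; a skew_rows value near 1.0 means
# the rows are spread evenly across slices.
SKEW_SQL = """
SELECT "schema", "table", diststyle, skew_rows, size
FROM svv_table_info
WHERE skew_rows IS NOT NULL
ORDER BY skew_rows DESC;
"""

_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_$]*$")
_MODES = ("FULL", "SORT ONLY", "DELETE ONLY")


def skewed_tables(rows, max_ratio=4.0):
    """Return (schema, table) pairs whose skew_rows exceeds max_ratio.

    rows: iterable of (schema, table, diststyle, skew_rows, size) tuples.
    """
    return [
        (schema, table)
        for schema, table, _diststyle, skew, _size in rows
        if skew is not None and float(skew) > max_ratio
    ]


def vacuum_statement(schema, table, mode="FULL", threshold_pct=None):
    """Build a VACUUM statement; threshold_pct maps to 'TO n PERCENT'."""
    if mode not in _MODES:
        raise ValueError(f"unsupported mode: {mode}")
    for name in (schema, table):
        # Only plain identifiers are accepted, to keep the statement safe.
        if not _IDENT.match(name):
            raise ValueError(f"unexpected identifier: {name!r}")
    stmt = f"VACUUM {mode} {schema}.{table}"
    if threshold_pct is not None:
        if not 0 <= threshold_pct <= 100:
            raise ValueError("threshold_pct must be between 0 and 100")
        stmt += f" TO {int(threshold_pct)} PERCENT"
    return stmt + ";"


# Hypothetical sample rows, not real cluster output.
sample_skew = [
    ("public", "orders", "KEY(customer_id)", 12.3, 1200),
    ("public", "events", "EVEN", 1.0, 800),
]
print(skewed_tables(sample_skew))  # [('public', 'orders')]
print(vacuum_statement("public", "events", mode="DELETE ONLY"))
print(vacuum_statement("public", "orders", threshold_pct=100))
```

The last two lines print `VACUUM DELETE ONLY public.events;` and `VACUUM FULL public.orders TO 100 PERCENT;`. Vacuuming reclaims and re-sorts space but does not change how rows are distributed, so a skewed table still needs its distribution style or key reviewed separately.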
answered 6 months ago
