
How do I troubleshoot local storage issues in my Aurora PostgreSQL-Compatible DB instances?


I want to troubleshoot temporary tables, log files, and other factors that cause local storage issues in my Amazon Aurora PostgreSQL-Compatible Edition DB instances.

Short Description

There are two types of storage in Amazon Aurora: the shared cluster volume, which stores persistent data, and local storage that's attached to each DB instance.

Aurora PostgreSQL-Compatible uses local storage to store error logs and temporary files. The local storage size is tied to the DB instance class, so you can increase it only by moving to a larger DB instance class. For non-T (non-burstable) instance classes, the system writes logs to a dedicated volume that it cleans automatically.

To monitor the local storage space that's associated with the Aurora DB instance or node, use the Amazon CloudWatch metric FreeLocalStorage. This metric reports the amount of storage that's available for the temporary tables in each DB instance.

Resolution

There's no local storage space left

You receive the "could not write block ###### of temporary file: No space left on device" error message when the DB instance runs out of local storage for temporary files. This error can occur during the following operations:

  • Altering large tables
  • Adding indexes on large tables
  • Running large SELECT queries with complex JOIN, GROUP BY, or ORDER BY clauses
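To check whether a specific query spills to local storage, you can run it under EXPLAIN ANALYZE. The table and column names in this example are hypothetical:

maxiops=> EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM orders ORDER BY created_at;

In the plan output, a sort node that reports a Sort Method of external merge together with a Disk value indicates that the sort spilled to temporary files on local storage. Raising work_mem, as described later in this article, can move such sorts back into memory.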

Check the temporary file size

To identify the details of the temporary file, turn on the log_temp_files parameter in the Aurora PostgreSQL-Compatible DB instance. Then, compare the temporary files to the FreeLocalStorage metric.

The log_temp_files parameter logs each temporary file that's at least the specified size. A value of 0 logs information for all temporary files. A positive value logs only the files that are larger than or equal to that number of kilobytes. The default value is -1, which turns off temporary file logging. Note that turning on log_temp_files can itself cause excessive logging on the Aurora PostgreSQL-Compatible DB instance.

It's a best practice to check the size of the Aurora PostgreSQL-Compatible log files before you turn on log_temp_files. If the log files consume the maximum space for the local storage, then reduce the value of rds.log_retention to reclaim space. The default value for rds.log_retention is 3 days.
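You normally turn on log_temp_files in the DB parameter group. Depending on your role's privileges (it's a superuser-context parameter), you might also be able to set it for the current session while you investigate, for example:

maxiops=> SET log_temp_files = 0;   -- log every temporary file that this session creates
maxiops=> SHOW log_temp_files;

A session-level setting affects only the current session and reverts when the session ends, so it avoids cluster-wide log growth while you test.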

To review the temporary file size, run the following command multiple times:

maxiops=> SELECT datname, temp_files, pg_size_pretty(temp_bytes) AS temp_file_size FROM pg_stat_database ORDER BY temp_bytes DESC;

Note: Because these counters are cumulative, compare the output across runs. An increase between runs indicates new temporary file activity.

The temp_files column counts all temporary files, regardless of when each file was created. The temp_files and temp_bytes columns in the pg_stat_database view are cumulative counters. To reset them, use the pg_stat_reset() function or restart the DB instance. For more information, see Additional statistics functions on the PostgreSQL website.
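For example, to establish a fresh baseline before you rerun a workload, you can reset the statistics for the current database. Keep in mind that pg_stat_reset() clears all pg_stat_database counters for that database, not only the temporary file counters:

maxiops=> SELECT pg_stat_reset();
maxiops=> SELECT datname, temp_files, temp_bytes FROM pg_stat_database WHERE datname = current_database();

After the reset, any nonzero temp_files or temp_bytes values come from activity that happened since the reset.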

If you use Aurora PostgreSQL-Compatible 10 or later, then you can monitor temp_bytes and temp_files with Performance Insights. Performance Insights provides native counters for your DB engine's internal metrics and wait events. You can also use Database Insights to monitor the database logs.

To allocate more memory to the processes that perform the operation, you can increase the maintenance_work_mem and work_mem parameters. This method uses more memory for the operation, and less temporary disk storage. For more information about these parameters, see maintenance_work_mem and work_mem on the PostgreSQL website.

It's a best practice to set maintenance_work_mem and work_mem at the query or session level rather than globally, so that your DB instances don't run out of memory. For more information, see Amazon Aurora PostgreSQL reference.
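For example, you can raise the limits for a single session before a large operation, such as an index build. The values below are illustrative only; size them to the memory that's available on your instance class:

maxiops=> SET maintenance_work_mem = '1GB';   -- illustrative value, applies to this session only
maxiops=> SET work_mem = '256MB';             -- illustrative value, applies to this session only
-- run the large operation here, then restore the defaults:
maxiops=> RESET maintenance_work_mem;
maxiops=> RESET work_mem;

Because each sort or hash operation in each session can use up to work_mem, keep the global values conservative and raise them only where a specific query needs it.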

Check the temporary table size

Run the following query:

maxiops=> SELECT n.nspname AS SchemaName,
       c.relname AS RelationName,
       CASE c.relkind
         WHEN 'r' THEN 'table'
         WHEN 'v' THEN 'view'
         WHEN 'i' THEN 'index'
         WHEN 'S' THEN 'sequence'
         WHEN 's' THEN 'special'
       END AS RelationType,
       pg_catalog.pg_get_userbyid(c.relowner) AS RelationOwner,
       pg_size_pretty(pg_relation_size(n.nspname || '.' || c.relname)) AS RelationSize
FROM pg_catalog.pg_class c
LEFT JOIN pg_catalog.pg_namespace n
       ON n.oid = c.relnamespace
WHERE c.relkind IN ('r', 's')
  AND n.nspname !~ '^pg_toast'
  AND n.nspname LIKE 'pg_temp%'
ORDER BY pg_relation_size(n.nspname || '.' || c.relname) DESC;

It's a best practice to closely monitor your application to identify the transactions that create temporary tables. You can then manage the usage of the available local storage capacity. You can also move your Aurora DB instance to a larger instance class so that the DB instance has more available local storage.

Log files that use local storage (applies only to burstable-performance instance classes)

Excessive logging can cause your DB instance to run out of local storage. This applies only to burstable-performance instance classes, such as db.t2, db.t3, and db.t4g. The consumption can occur because of either excessive logging or retaining the error log for a long time.

To identify the parameter that causes the excessive logging, analyze the PostgreSQL logs to find the largest logs. Then, identify the parameter that's responsible for the majority of the entries in those logs. You can then modify the parameter that causes the excessive logging.
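On Aurora PostgreSQL-Compatible 10 and later, and subject to your role's privileges, you can list the log files and their sizes directly from SQL with pg_ls_logdir() to find the largest ones:

maxiops=> SELECT name, size, modification FROM pg_ls_logdir() ORDER BY size DESC LIMIT 10;

You can then download the largest files through the Amazon RDS console or CLI and inspect which parameter generates most of their entries.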

Examples of logging parameters that can consume local storage space:

rds.log_retention_period  
auto_explain.log_min_duration  
log_connections  
log_disconnections  
log_lock_waits  
log_min_duration_statement  
log_statement  
log_statement_stats

If you repeatedly run a query that fails with an error, then PostgreSQL logs the errors to the PostgreSQL error log by default. To prevent the error logs from using excessive storage, review the errors in the log, and then fix the failing query. You can also reduce the 3-day default value for rds.log_retention to reclaim the space that the error logs use.

If you can't reduce the logging, it's a best practice to move to a larger instance class. This way, your Aurora DB instance has more available local storage.

Related information

Best practices with Amazon Aurora PostgreSQL
