While upgrading an RDS instance from PostgreSQL engine 11.19 to 15.2, the instance became stuck restarting over and over. It has been doing so for 24 hours.
Snippet from one of the logs:
2023-08-22 15:04:35 UTC::@:[395]:LOG: checkpoint starting: time
2023-08-22 15:04:35 UTC::@:[395]:LOG: checkpoint complete: wrote 3 buffers (0.0%); 0 WAL file(s) added, 0 removed, 1 recycled; write=0.417 s, sync=0.002 s, total=0.430 s; sync files=3, longest=0.001 s, average=0.001 s; distance=0 kB, estimate=0 kB
2023-08-22 15:09:35 UTC::@:[395]:LOG: checkpoint starting: time
2023-08-22 15:09:36 UTC::@:[395]:LOG: checkpoint complete: wrote 2 buffers (0.0%); 0 WAL file(s) added, 0 removed, 1 recycled; write=0.429 s, sync=0.002 s, total=0.447 s; sync files=2, longest=0.001 s, average=0.001 s; distance=65536 kB, estimate=65536 kB
2023-08-22 15:09:38 UTC::@:[393]:LOG: received fast shutdown request
2023-08-22 15:09:38 UTC::@:[393]:LOG: aborting any active transactions
2023-08-22 15:09:38 UTC::@:[393]:LOG: background worker "logical replication launcher" (PID 401) exited with exit code 1
2023-08-22 15:09:38 UTC::@:[395]:LOG: shutting down
2023-08-22 15:09:39 UTC::@:[395]:LOG: checkpoint starting: shutdown immediate
2023-08-22 15:09:39 UTC::@:[395]:LOG: checkpoint complete: wrote 0 buffers (0.0%); 0 WAL file(s) added, 0 removed, 1 recycled; write=0.016 s, sync=0.001 s, total=0.033 s; sync files=0, longest=0.000 s, average=0.000 s; distance=131071 kB, estimate=131071 kB
2023-08-22 15:09:39 UTC::@:[393]:LOG: database system is shut down
2023-08-22 15:09:43 UTC::@:[393]:LOG: starting PostgreSQL 15.2 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-12), 64-bit
2023-08-22 15:09:43 UTC::@:[393]:LOG: listening on IPv4 address "127.0.0.1", port 5432
2023-08-22 15:09:43 UTC::@:[393]:LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2023-08-22 15:09:43 UTC::@:[397]:LOG: database system was shut down at 2023-08-22 15:09:39 UTC
2023-08-22 15:09:43 UTC::@:[393]:LOG: database system is ready to accept connections
2023-08-22 15:14:43 UTC::@:[395]:LOG: checkpoint starting: time
2023-08-22 15:14:44 UTC::@:[395]:LOG: checkpoint complete: wrote 3 buffers (0.0%); 0 WAL file(s) added, 0 removed, 1 recycled; write=0.417 s, sync=0.002 s, total=0.429 s; sync files=3, longest=0.001 s, average=0.001 s; distance=0 kB, estimate=0 kB
2023-08-22 15:19:41 UTC::@:[393]:LOG: received fast shutdown request
2023-08-22 15:19:41 UTC::@:[393]:LOG: aborting any active transactions
2023-08-22 15:19:41 UTC::@:[393]:LOG: background worker "logical replication launcher" (PID 401) exited with exit code 1
2023-08-22 15:19:41 UTC::@:[395]:LOG: shutting down
2023-08-22 15:19:41 UTC::@:[395]:LOG: checkpoint starting: shutdown immediate
2023-08-22 15:19:41 UTC::@:[395]:LOG: checkpoint complete: wrote 2 buffers (0.0%); 0 WAL file(s) added, 0 removed, 1 recycled; write=0.010 s, sync=0.003 s, total=0.036 s; sync files=2, longest=0.003 s, average=0.002 s; distance=131071 kB, estimate=131071 kB
2023-08-22 15:19:41 UTC::@:[393]:LOG: database system is shut down
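To put a number on the restart cadence in the snippet above, here is a minimal Python sketch that measures the time between consecutive "received fast shutdown request" lines. The function name `shutdown_intervals` is made up for illustration, and it assumes every log line starts with a `YYYY-MM-DD HH:MM:SS` timestamp, as in the excerpt.

```python
from datetime import datetime

SHUTDOWN_MARKER = "received fast shutdown request"


def shutdown_intervals(log_lines):
    """Return the seconds between consecutive fast shutdown requests."""
    stamps = []
    for line in log_lines:
        if SHUTDOWN_MARKER in line:
            # The first 19 characters hold "YYYY-MM-DD HH:MM:SS".
            stamps.append(datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S"))
    return [(later - earlier).total_seconds()
            for earlier, later in zip(stamps, stamps[1:])]


# Three lines copied from the log excerpt above.
sample = [
    "2023-08-22 15:09:38 UTC::@:[393]:LOG: received fast shutdown request",
    "2023-08-22 15:09:43 UTC::@:[393]:LOG: database system is ready to accept connections",
    "2023-08-22 15:19:41 UTC::@:[393]:LOG: received fast shutdown request",
]

print(shutdown_intervals(sample))  # [603.0]
```

For the two shutdown requests shown, this gives roughly ten minutes between restarts.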
After 6 hours I created a new instance from the snapshot I had taken before the upgrade (a good thing I did, because the snapshot that should have been created as part of the upgrade is still stuck in the "creating" state) and configured our server to point to the new instance. Now I just need to stop the old instance.
However, I can't stop the old instance. The AWS web GUI has grayed out the option while the instance is upgrading, and when trying to do it through the AWS-console I get "An error occurred (InvalidDBInstanceState) when calling the StopDBInstance operation: Instance products-webserver-db is not in available state."
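For anyone trying to reproduce this, here is a minimal sketch of the same call made from Python, assuming a boto3 RDS client. Using boto3 is my assumption here, and the helper name `stop_if_available` is made up for illustration. It checks the reported status first, because the error message indicates that StopDBInstance is only accepted in the "available" state.

```python
def stop_if_available(rds_client, instance_id):
    """Stop the instance only when RDS reports it as 'available'.

    Returns the status that was observed, so the caller can see why
    no stop request was sent.
    """
    response = rds_client.describe_db_instances(DBInstanceIdentifier=instance_id)
    status = response["DBInstances"][0]["DBInstanceStatus"]
    if status == "available":
        rds_client.stop_db_instance(DBInstanceIdentifier=instance_id)
    return status


# Example usage (needs boto3 and AWS credentials, so it is left commented out):
# import boto3
# print(stop_if_available(boto3.client("rds"), "products-webserver-db"))
```

This does not force a stop. It only avoids the InvalidDBInstanceState error by skipping the request while the instance is in any other state.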
Is there any way to force a stop, or do I really have to delete the RDS instance?
As I said I already tried that: [...] and when trying to do it through the AWS-console I get "An error occurred (InvalidDBInstanceState) when calling the StopDBInstance operation: Instance products-webserver-db is not in available state."