Changing Transaction Isolation Level on Legacy Redshift Database

0

Reaching out to see if anyone has changed the transaction isolation level on one or more of their production Redshift databases from transaction isolation level serializable isolation to snapshot isolation.

My use of the word "legacy" in the question title means that the database was created prior to the release of snapshot isolation as the default for Redshift database. One relevant AWS announcement is located here.

New Redshift clusters (and their databases) default to snapshot isolation. So, the question relates solely to a legacy database that you changed...or, are thinking of changing the transaction isolation level on. Did you make any observations about data integrity that you tried to relate back to the change in your transaction isolation level? Any gotchas?

Adding that my Redshift cluster is used as an operational database. We have business logic in our ETL jobs that performs transformations for more than just analytics purposes. For example, we calculate certain lifetime metrics and then pass these metrics back to our production, operational databases in reverse ETLs. We have numerous update, delete, and insert statements that transaction with regular frequency.

asked a month ago132 views
1 Answer
0
Accepted Answer

Hello Phillip,

Starting May 22, 2024 , Amazon Redshift made snapshot isolation the default for provisioned clusters. This is because Redshift utilizes OLAP architecture. This prioritizes data read over data write operations. Availability is a low-priority concern as the primary use case is analytics. Serializable isolation in this case limits concurrency on operations and most Data Warehousing applications do not need these strict guidelines. [1]

Amazon Redshift allows tables to be read while they are being incrementally loaded or modified. Transaction isolation levels in Amazon Redshift are used to enable concurrent write operations and to ensure concurrent operations are serializable.

Serializable isolation is a principle of using write locks on tables and preserves the illusion that a transaction running against a table is the only transaction that is running against that table.

For example , for two concurrently running transaction, T1 and T2 , must produce the same results as at least one of the following:

  • T1 and T2 run serially in that order.

  • T2 and T1 run serially in that order.

On the other hand , snapshot isolation allows for transactions occurring concurrently to be invisible to each other. It achieves this by having every concurrent transaction create a snapshot of the database at the beginning of the transaction. This means that concurrent transactions cannot detect each other's changes.

If any serial execution of the concurrent transactions produces the same results as their concurrent execution, those transactions are deemed "serializable" and can be run safely. If no serial execution of those transactions can produce the same results, the transaction that runs a statement that might break the ability to serialize is stopped and rolled back. [1]

There is no effect to data integrity when changing transaction isolation level for your Redshift cluster. To ensure that executions and transactions do not run into serializable isolation errors, you can learn more on managing specific behaviour of concurrent write operations here [1], [2] .

You can ensure data integrity in Redshift by performing validation checks after loading your data.

Have a look at the resources below for further reading.

  1. https://aws.amazon.com/compare/the-difference-between-olap-and-oltp/
  2. https://docs.aws.amazon.com/redshift/latest/dg/c_serial_isolation.html
  3. https://docs.aws.amazon.com/redshift/latest/dg/c_write_readwrite.html
  4. https://aws.amazon.com/about-aws/whats-new/2024/05/amazon-redshift-snapshot-isolation-provisioned-clusters/
AWS
answered a month ago

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions