- Newest
- Most votes
- Most comments
Hello Phillip,
Starting May 22, 2024 , Amazon Redshift made snapshot isolation the default for provisioned clusters. This is because Redshift utilizes OLAP architecture. This prioritizes data read over data write operations. Availability is a low-priority concern as the primary use case is analytics. Serializable isolation in this case limits concurrency on operations and most Data Warehousing applications do not need these strict guidelines. [1]
Amazon Redshift allows tables to be read while they are being incrementally loaded or modified. Transaction isolation levels in Amazon Redshift are used to enable concurrent write operations and to ensure concurrent operations are serializable.
Serializable isolation is a principle of using write locks on tables and preserves the illusion that a transaction running against a table is the only transaction that is running against that table.
For example , for two concurrently running transaction, T1 and T2 , must produce the same results as at least one of the following:
-
T1 and T2 run serially in that order.
-
T2 and T1 run serially in that order.
On the other hand , snapshot isolation allows for transactions occurring concurrently to be invisible to each other. It achieves this by having every concurrent transaction create a snapshot of the database at the beginning of the transaction. This means that concurrent transactions cannot detect each other's changes.
If any serial execution of the concurrent transactions produces the same results as their concurrent execution, those transactions are deemed "serializable" and can be run safely. If no serial execution of those transactions can produce the same results, the transaction that runs a statement that might break the ability to serialize is stopped and rolled back. [1]
There is no effect to data integrity when changing transaction isolation level for your Redshift cluster. To ensure that executions and transactions do not run into serializable isolation errors, you can learn more on managing specific behaviour of concurrent write operations here [1], [2] .
You can ensure data integrity in Redshift by performing validation checks after loading your data.
Have a look at the resources below for further reading.
- https://aws.amazon.com/compare/the-difference-between-olap-and-oltp/
- https://docs.aws.amazon.com/redshift/latest/dg/c_serial_isolation.html
- https://docs.aws.amazon.com/redshift/latest/dg/c_write_readwrite.html
- https://aws.amazon.com/about-aws/whats-new/2024/05/amazon-redshift-snapshot-isolation-provisioned-clusters/
Relevant content
- AWS OFFICIALUpdated 2 years ago
- AWS OFFICIALUpdated 3 years ago
- AWS OFFICIALUpdated 2 years ago
- AWS OFFICIALUpdated a year ago