Database – Data Concurrency, Data Consistency and Isolation Level

In the previous posts, we read about database transactions and how we can manage them in Java and PHP on different layers.

In this post, we will read about data concurrency/consistency and figure out how to properly set the isolation level of an explicit DBMS transaction. We will see several examples using three distinct RDBMS: Oracle 19c, MySQL 8 and PostgreSQL 14.

Data Concurrency and Consistency

In a multi-user database, we can have different transactions running concurrently, and the result produced must be meaningful and consistent. So, a multi-user database must ensure that users can access data simultaneously (data concurrency); and guarantee that each user sees a consistent view of the data (data consistency), including visible changes made by the user’s transactions and committed transactions of other users.

Isolation Level

Transaction isolation is one of the foundations of database processing. Isolation is the I in the acronym ACID. It is the characteristic of a DBMS transaction that tries to find the balance between performance and reliability, consistency, and reproducibility of results when multiple transactions make changes and perform queries simultaneously.

The SQL standard defines four levels of transaction isolation. These levels have differing degrees of impact on transaction processing throughput. They are:

READ UNCOMMITTED: this isolation level allows dirty reads. One transaction may see uncommitted changes made by some other transaction.

  • Oracle: not available!
  • MySQL: is available, and the SELECT statements are performed in a non-locking way!
  • PostgreSQL: PostgreSQL’s Read Uncommitted mode behaves like Read Committed!

READ COMMITTED: guarantees that any data read was committed when it is read. It restricts the reader from seeing any intermediate, uncommitted, ‘dirty’ read. It makes no promise that if the transaction re-issues the read, it will find the same data: data is free to change after it is read: a concurrent transaction may update the record, delete it, or insert new records.

  • Oracle: is available, and it’s the default level! For each execution of the SELECT statement in the function, Oracle establishes a new read-consistent snapshot. In this way, it can prevent Dirty Reads. On this level, rows are not locked!
  • MySQL: is available, and for UPDATE or DELETE statements, InnoDB holds locks only for rows that it updates or deletes. Record locks for nonmatching rows are released after MySQL has evaluated the WHERE condition.
  • PostgreSQL: is available, and it’s the default level! A SELECT query sees a snapshot of the database as of the instant the query begins to run. On this level, rows are not locked!

REPEATABLE READ: is a higher isolation level that, in addition to the “Read Committed” level, guarantees that any data read cannot change. If the transaction rereads the same data, it will find the previously read data in place, unchanged, and available to read. A concurrent transaction may add new rows.

  • Oracle: not available!
  • MySQL: is available, and it is the default isolation level. For a unique index with a unique search condition, InnoDB locks only the index record found, not the gap before. For other search conditions, InnoDB locks the index range scanned, using gap locks or next-key locks to block insertions by other sessions into the gaps covered by the range.
  • PostgreSQL: is available, and applications using this level must be prepared to retry transactions due to serialization failures because of locking applied on rows.

SERIALIZABLE: this is the highest transaction isolation level. Under this level, transactions are serialized and executed sequentially, which can avoid the phenomena described in this post. However, this transaction isolation level is inefficient and consumes database performance.

  • Oracle: is available, and a serializable transaction can modify a row only if changes to the row made by other transactions were already committed when the serializable transaction began. An error is generated when a serializable transaction tries to update or delete data changed by a different transaction that was committed after the serializable transaction began.
  • MySQL: is available, and if “autocommit” is disabled, InnoDB convert automatically all the SELECT statements applying the “for share” lock.
  • PostgreSQL: this level does not introduce any blocking beyond that present in the Repeatable Read level, but there is some overhead to monitoring and detecting the conditions which could cause a serialization anomaly that can trigger a serialization failure.

Concurrency Phenomena

These isolation levels are defined to prevent some phenomena that can occur during the concurrent execution of the transactions. These phenomena are:

  • DIRTY READS: it happens when one transaction can read data written by another transaction that is not committed yet. In the example below, Transaction 2 reads the item’s stock with a value of 11, updated by Transaction 1. After that, Transaction 1 is rollbacked, and the item’s stock value returns to the previous value (e.g. 10), but Transaction 2 is unaware of the change and holds the value 11.

  • NON-REPEATABLE (FUZZY) READ: it could happen when a transaction reads twice a row without applying a shared lock, and in the meantime, another transaction changes the value of the same row. In the example below, Transaction 2 reads 11 as the item’s stock value and starts several business operations based on that value. In the meantime, Transaction 1 updates the same row setting the value 12. After Transaction 1’s commitment, Transaction 2 rereads the same row, but the item’s stock value is now different.

  • PHANTOM READS: it occurs when, in the course of a transaction, two identical queries are executed, and the collection of rows returned by the second query is different from the first. Transaction 2 reads all the red items in the example below and fetches two rows. Transaction 1 stores a new red thing in the database. Transaction 2 reads all the red items, now finding three elements!

  • SERIALIZATION ANOMALY: It occurs when the result of a group of concurrent committed transactions cannot be achieved if we try to run them sequentially in any order without overlapping each other.

There are more than these described phenomena, and in the 1995 paper “A Critique of ANSI SQL Isolation Levels” you can find others described as well.

Let’s see now which isolation level prevents all the phenomena described above:

Isolation LevelDirty ReadNonrepeatable ReadPhantom ReadSerialization Anomaly
Read uncommittedPossible (Not in PostgreSQL)PossiblePossiblePossible
Read committedNot possiblePossiblePossiblePossible
Repeatable readNot possibleNot possiblePossible (Not in PostgreSQL)Possible
SerializableNot possibleNot possibleNot possibleNot possible

In the next post, we will read about the 2PL (Two-Phase Locking) and the MVCC (Multi-Version Concurrency Control) mechanisms that can be applied to prevent all the phenomena discussed in this reading.


Thanks for reading. Stay tuned!

Learn. Grow. Teach.

Subscribe
Notify of
0 Comments
Most Voted
Newest Oldest
Inline Feedbacks
View all comments
0
Would love your thoughts, please comment.x
()
x