Isolation levels and bankruptcy

In database systems, isolation determines how and when the work done by one transaction becomes visible to other concurrent users and systems.

Though the term is often used in a relaxed way, this ACID property of a DBMS (Database Management System) is an important part of any transactional system. It specifies when and how the changes made by one operation become visible to other parallel operations.

Acquiring locks on data is the classic way to achieve a given isolation level: the more locks a transaction takes, and the longer it holds them, the higher the isolation level. On the other hand, locks have an impact on performance.


Isolation levels

Since isolation deals with the consistency of data retrieved by queries, several levels of this property are defined.

  • Read uncommitted. This is the weakest of the four isolation levels. At this level, dirty reads are allowed, so one transaction may see not-yet-committed changes made by other transactions. It allows all three “concurrency phenomena”: dirty reads, non-repeatable reads, and phantom reads.
  • Read committed. At this isolation level, a lock-based concurrency control keeps write locks until the end of the transaction, but read locks are released as soon as the SELECT operation is performed. The non-repeatable read phenomenon can occur at this level.
  • Repeatable read. At this isolation level, a lock-based concurrency control keeps read and write locks until the end of the transaction. However, only the rows touched by the transaction are locked, so newly inserted rows can lead to phantom reads. It is also possible that rows not yet reached by the scan (and therefore not yet locked) are modified by another transaction, with similar results.
  • Serializable. This is the highest isolation level. Serializability requires read and write locks to be held until the end of the transaction, and range locks must also be acquired when a SELECT query uses a ranged WHERE clause, in order to avoid the phantom read phenomenon. To go deeper into isolation levels and related concepts (Multiversion Concurrency Control, snapshot isolation, serializable snapshot isolation), check this and this. To become an expert, check this (math-notation knowledge required).

Most RDBMSs have “read committed” as their default isolation level, although a client can request a stricter level, as sketched below.
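As a quick, non-authoritative illustration, here is a minimal Python sketch of how an application could ask for a stricter level. It assumes PostgreSQL with the psycopg2 driver; the connection string and the match_quotas table are placeholders invented for the example, not part of the original post.

    # Minimal sketch: requesting a stricter isolation level from the client side.
    # Assumes PostgreSQL and psycopg2; connection parameters and the match_quotas
    # table are illustrative placeholders only.
    import psycopg2

    conn = psycopg2.connect("dbname=bets user=gambler")

    # Every transaction opened on this connection will now run as REPEATABLE READ
    # instead of PostgreSQL's default READ COMMITTED.
    conn.set_session(isolation_level="REPEATABLE READ")

    with conn:  # opens a transaction, commits on success
        with conn.cursor() as cur:
            cur.execute("SELECT quota_tie FROM match_quotas WHERE match_id = %s", (1,))
            first = cur.fetchall()

            # ... long-running validation logic would run here ...

            cur.execute("SELECT quota_tie FROM match_quotas WHERE match_id = %s", (1,))
            second = cur.fetchall()

            # Under REPEATABLE READ both queries read from the same snapshot, so the
            # results match even if another session committed an update in between.
            assert first == second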

Once the context has been set, let’s tell a fictional story to show some possible effects of non-repeatable or phantom reads.

The way to bankruptcy, chapter 1.

The Red Bank™ (not really a trademark, and no bank in particular is hinted at) wants to diversify its investments and decides to place a sports bet on the next football match.

The Best Bet™ (not really a trademark either) has a web page to manage online bets, both before and during the match. The company has a complex algorithm to calculate the quotas (how much money is paid per unit of money bet) for each win and for the tie. Its database model is as simple as this:

MatchID | Team | Goals | Quota Win | Quota Tie
1       | B    | 0     | 1.1       | 1.5
1       | M    | 0     | 2         | 1.5

Before the match starts, and until a goal is scored, team B is the favourite, paying back just a tick over the unit for its win. A victory of team M will be paid at double the bet, and the tie will be rewarded at a 1.5 rate (for example, a €100 bet on the tie returns €150).

At the very end of the match, with the score still 0-0, the CGO (Chief Gambling Officer) of The Red Bank™ decides to access the web to bet on the tie.

Transaction T1 will manage the bet. It is a development from The Red Bank™ that can access the data model of The Best Bet™ to check the score and the quotas. Moreover, it has its own complex algorithm to check whether the quota is fair enough. Once this has been validated, T1 submits the bet to The Best Bet™ system.

There is also T2, a program from The Best Bet™ that can update the score at any moment. This is the sequence:

[Figure: sequence of operations interleaving T1 and T2]

T1 performs a SELECT and starts to calculate what it would consider a fair quota, using the 0-0 score as a parameter. Once the calculation finishes, another query is performed to compare the result with the value in the system.

Meanwhile, team B scores, so quotas are adjusted automatically.

But the second SELECT from T1 only compares the QUOTA_TIE field, ignoring the update to the score. A quota of 3.0 for an almost finished match seems rather profitable! So T1 decides to bet… one billion € on the tie! (DON’T DO THIS AT HOME.)

The match finishes and a couple of programmers are fired. The CGO is punished, but he still keeps his job. Of course, there is an obvious design flaw in transaction T1, but the non-repeatable read could have been avoided with a higher isolation level, as the sketch below illustrates.
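To make the failure mode concrete, here is a hedged Python sketch of the same race, again assuming PostgreSQL, psycopg2 and an invented match_quotas table: under the default READ COMMITTED level, the second read of T1 can legitimately return a different quota than the first.

    # Sketch of the non-repeatable read from chapter 1 (assumed PostgreSQL/psycopg2,
    # invented schema). T1 reads twice inside one transaction; T2 commits an update
    # in between, so under READ COMMITTED the two reads of T1 disagree.
    import psycopg2

    t1 = psycopg2.connect("dbname=best_bet")   # The Red Bank's betting transaction
    t2 = psycopg2.connect("dbname=best_bet")   # The Best Bet's score updater

    cur1 = t1.cursor()
    cur2 = t2.cursor()

    # T1: first read, used to compute the "fair" quota (score is still 0-0 here).
    cur1.execute("SELECT goals, quota_tie FROM match_quotas WHERE match_id = 1")
    before = cur1.fetchall()

    # T2: team B scores, the quotas are adjusted, and the change is committed.
    cur2.execute(
        "UPDATE match_quotas SET goals = goals + 1, quota_tie = 3.0 "
        "WHERE match_id = 1 AND team = 'B'"
    )
    t2.commit()

    # T1: second read, still inside the same transaction.
    cur1.execute("SELECT goals, quota_tie FROM match_quotas WHERE match_id = 1")
    after = cur1.fetchall()

    # Under READ COMMITTED 'before' and 'after' may differ (a non-repeatable read);
    # under REPEATABLE READ or SERIALIZABLE they would be identical.
    print(before != after)
    t1.commit()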

The way to bankruptcy, chapter 2.

The Red Bank™ is trying to recover from a past disastrous investment, so the CGO and his new team define a new bet transaction.

Moreover, they look for another betting site, The Real Bet™, which has the following database model:

MatchID | Team | Minute | Goals | Quota Win | Quota Tie
1       | B    | 0      | 0     | 1.1       | 1.5
1       | M    | 0      | 0     | 2         | 1.5

After the bad past experience, the new bet transaction from The Red Bank™, T1, will check the score before executing the bet.

Transaction T2, from The Real Bet™, is the program in charge of updating the time and the score.

At minute 89, with the score at 0-0, T1 is executed. This is the sequence:

[Figure: sequence of operations interleaving T1 and T2]

T1 queries for the latest info and executes its validation. This is about making a safe bet, so it will only continue after verifying that the result is OK and the match is almost finished. But team B scores at the very end of the match, so T2 inserts the new rows into the table.

T1 queries again to check the quotas, but it misses the new (phantom) rows, which were inserted after the execution of the sub-select. And, hey, there is a lot of money to recover, so T1 goes strong: one billion euros… what could be lost? (AGAIN, DON’T DO THIS AT HOME.)

The match finishes and no programmers are fired… they are simply not paid, because the bank has gone bankrupt. As in the first case, a higher isolation level could have avoided the phantom read; the sketch below shows the textbook version of the phenomenon.
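Again as a non-authoritative sketch, slightly simplified with respect to the story (assumed PostgreSQL, psycopg2 and an invented match_quotas table mirroring The Real Bet™ model): a row inserted and committed by T2 between two range queries of T1 shows up as a phantom in the second result set when T1 runs at READ COMMITTED.

    # Sketch of the phantom read from chapter 2 (assumed PostgreSQL/psycopg2,
    # invented schema and quota values). T1 runs the same range query twice; T2
    # inserts and commits new matching rows in between, which appear as "phantoms"
    # in the second result under READ COMMITTED.
    import psycopg2

    t1 = psycopg2.connect("dbname=real_bet")   # The Red Bank's new bet transaction
    t2 = psycopg2.connect("dbname=real_bet")   # The Real Bet's score/time updater

    cur1 = t1.cursor()
    cur2 = t2.cursor()

    # T1: latest state of the match, used to validate that a tie bet looks safe.
    query = (
        "SELECT team, minute, goals, quota_tie "
        "FROM match_quotas WHERE match_id = 1 AND minute >= 89"
    )
    cur1.execute(query)
    first_rows = cur1.fetchall()          # only the minute-89, 0-0 rows

    # T2: team B scores at minute 90; new rows are inserted and committed.
    cur2.execute(
        "INSERT INTO match_quotas (match_id, team, minute, goals, quota_win, quota_tie) "
        "VALUES (1, 'B', 90, 1, 1.01, 9.0), (1, 'M', 90, 0, 15.0, 9.0)"
    )
    t2.commit()

    # T1: repeats the same range query inside the same transaction.
    cur1.execute(query)
    second_rows = cur1.fetchall()         # now also contains the minute-90 rows

    # The extra rows are phantoms. A SERIALIZABLE transaction (or PostgreSQL's
    # snapshot-based REPEATABLE READ) would not have seen them.
    print(len(second_rows) > len(first_rows))
    t1.commit()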

Protection in production?

There are real, more complex cases (especially, but not only, those involving triggers or more concurrent transactions) that lead to this kind of problem. One of them was the motivation for developing SSI (Serializable Snapshot Isolation) in PostgreSQL.

Of course, these are fictional cases and (aside from a ridiculous betting strategy) bad design has a strong influence in the examples. So the debate could be: “Must databases work on a bad-design-proof basis?”


Raúl Alonso

As a physicist, I've always been interested in how things work. I started my career as a developer and analyst for a consulting company, and in 2011 I signed with BBVA. After some years dedicated to project management, since 2015 I have been part of the Data Architecture team, where I specialize in databases and DaaS.

