Understanding MySQL Isolation Levels: Repeatable-read

Posted in: MySQL, Technical Track

Isolation levels are a rare subject in MySQL literature. The documentation provides a terse description and focuses mainly on locking issues, but does not discuss the semantics of each isolation level. This is not only a problem that affects MySQL documentation but also the SQL standard itself.

Both the lack of documentation and the absence of a deeper description of the expected behavior in the SQL standard make isolation levels a topic that is more assumed than known by database administrators and developers. In this blog post, I aim to help you understand how the default isolation level in MySQL works and show you some surprising facts about it.

But first let’s see how isolation levels are described in the standard: “The transaction isolation level of a SQL-transaction defines the degree to which the operations on SQL-data, or schemas in that SQL-transaction are affected by the effects of and can affect operations on SQL-data or schemas in concurrent SQL-transactions”. To put it in plain words, isolation levels define how concurrent transactions interact while modifying data.

MySQL uses Repeatable-read as the default level. In the standard, this level forbids dirty reads (non committed data) and non repeatable reads (executing the same query twice should return the same values) and allows phantom reads (new rows are visible). But MySQL implements it in a different way. Let’s see how it is implemented with some examples.

MySQL Repeatable-Read tests

We create two connections against a MySQL server. We will call them Session Blue and Session Red (The fact that these are the colors of FC Barcelona is purely coincidental). In Session Blue we will create the database isolation and the table repeatable_read, both will be required for this test.

Create database isolation and table repeatable_read

No phantom reads… only phantom writes!

We start a transaction, verify current isolation level by checking the value of the variable tx_isolation, and retrieve the contents of repeatable_read table, this way we create a snapshot of that table for the whole transaction.

Start transaction, check tx_isolation value and retrieve table contents

Now we move to Session Red and insert two rows into the table. We commit the transaction to make sure data is updated.

In a different session we insert two rows into the table

Next we check what data is retrieved in Session Blue.

The first session, as expected, is not able to see the data inserted in the previous step.

As we can see, repeatable-read in MySQL avoids Phantom Reads, as rows are not retrieved. This is more restrictive than the standard description of the isolation level. But, what happens if we try to update the table contents? Our intuition is that we should not update any rows.

But I'm able to update the added rows.

Surprise! The update command tells us that one row matched and one row was changed. Let’s select table contents to view what is happening.

Now we see only the rows modified in the table.

We see just one row, the row that was modified by the update executed before. This is quite unexpected and counter-intuitive as the table never had one single row committed; we inserted and committed two rows. We are seeing a view of the table that never existed. As expected, when we commit, we see both rows, the one we modified and the other that was inserted before in Session Red.

When we commit, we can see all the data.

More phantom writes!

Now we will begin another transaction and retrieve table contents. We retrieve table contents to create a snapshot (probably we should call it a version) of the table.

We start a new transaction and retrieve table contents.

Back in Session Red, we will run a transaction to update the contents of the table. Note: Phantom reads only affect new rows, not the ones already existing.

We update one of the existing rows in another session.

Let’s find what Session Blue retrieves and what can update.

Updates do not honor the values retrieved, but the values modified in a different session.

Initially, we see the table unchanged. But no rows matched when we try to update the table using the data we retrieved in the select. We see one row with value “modified” for the text column, but the update finds no rows. When we update the table using a column value that was not modified by any transaction, in this case id, then we are able to proceed. Now we see the new value for the text column.

WYSINWYW (What you see is NOT what you write)

We will return our table to the original values and we will create an additional table required for the next test.

We rebuild the table with the original values and create one more table.

As usual we start a new transaction and we retrieve table contents to create the snapshot.

Once again we start a transaction and retrieve table contents.

Now we go back to Session Red to update the contents of the whole table.

In another session, we modify table data.

Returning to Session Blue, we “clone” the contents of the table repeatable_read to repeatable_read_copy table using an insert into … select statement. After that we retrieve the values of both tables using a select.

We do an insert as select to copy the table results and we find that rows are written with the modified values

The values of rows inserted into the copy table using an insert into … as select is different than the values of rows retrieved using a regular select statement. Once we commit the transaction, as expected, we are able to see the modified data in the original table too.

Results now are as expected

Conclusions

After these tests, we have found about MySQL implementation of Repeatable-read isolation level:

  • When using just select statements is even more restrictive than SQL Standard, as Phantom Reads do not happen. Besides the snapshot is used for all tables and all rows, as we find while we use a mysqldump with –single-transaction.
  • When the transaction modifies data, the behavior is a mix of Repeatable-read (rows not modified are not visible) and Read committed (modified rows are visible). We cannot say that this is not the standard as these situations are not described in it and do not fit in the three concurrency phenomena: Dirty Read, Non-repeatable Read and Phantom Read.
  • When the transaction writes new data based on existing data, it uses the committed data, instead of the snapshot retrieved previously. This is valid both for modified and new rows, mimicking Read committed behavior.

The way MySQL implements Repeatable Read is not intuitive and, although it is required to support statement replication, can lead to some problems while running data modification and transfer to other tables in concurrent transactions. If your application can face these issues, you will need to modify your queries using select … for update statements and thus increase the number of locks in the database.

email

Interested in working with Pep? Schedule a tech call.

About the Author

Pep has a broad experience in several database platforms, but in recent years he has focused on MySQL. His work abides by the motto of Mission Control at NASA: “Tough and competent”. Tough means you are accountable for what you do or fail to do, it means compromise and responsibility. Competent means that you take nothing for granted and you must never be found short in knowledge and skills. This is how Pep feels and lives database management. He is also interested in applying Lean culture to his job and enjoys living in Barcelona with his kids and his partner.

12 Comments. Leave new

Hi,

I think we can describe this behaviour in a simple statement: Repeatable Read only applies for Selects, for DML it behaves as Read Commited.

Reply
Evan J Carroll
June 2, 2017 1:19 am

Really great blog entry, but I hope in the future you actually style this with CSS and not use hard-to-read graphics.

Reply

Thank you very much. I’m sorry if the graphics are hard to read, and also not very good for SEO, but I wanted to stress the point that these are actual screen captures. In any case, thank you again, and I’ll use CSS next time.

Reply

At the very start, when you insert two rows (using Red Session) and commit changes, the Blue Session will show two records actually (after the SELECT statement). And this is expected behaviour IMO. I just tried it. Also, why do you think that rows shouldn’t show in a different session, if changes are already committed by another session ? The transaction was successful, and other session can see the inserted rows… Can you please comment on this?

Reply

The Blue Session will not show the rows as long as the isolation level is REPEATABLE-READ and you are within a transaction. In REPEATABLE-READ you won’t see changes already committed. Make sure you are following all the steps in the Blue Session, this is “start transaction” and verify that isolation level “show variables like ‘tx_isolation'”.

Reply

Hi, thanks for your response… Well what I did is:

1. Started transaction in a Blue Session
2. Started transaction in a Red Session
3. Have inserted two rows in a table from within Red Session
4. Committed changes in a red session
5. Selected all rows in a red session, and got two rows
6. Selected all rows in blue session , and got two rows as well (while blue session transaction was still uncommitted)

Why do I see different results than in your tutorial ? I have checked, and tx_isolation variable value is REPEATABLE-READ.

Reply

@Pep Pla , Sorry I just got the results like from your tutorial.. I don’t know why it isn’t worked for me at first. But, now it is working as you described…

Reply

Okay, I actually found why sometimes a Blue Session shows no rows, and sometimes shows 2 rows. If I make a SELECT query before I commit changes in Red Session, I get empty result set. So, because the name says repeatable-read, every next read is the same as the first one, and you always get empty rows. Now, if I do a SELECT in a Blue Session for the first time, but after the COMMIT; is executed in a Red Session, I am able to see two rows added from within Red Session. You can test this out… Any insights would be appreciated.

Reply

For REPEATABLE-READ, the snapshot is created the first time data is recovered from the database, not when the session starts. This means that every change that takes place before the first data is retrieved will be visible for that session.

Reply

Hi, Thanks for the blog,,, I just have one doubt…
1.) Let’s forget about mysql,,, As a normal behaviour of repeatable-read. Do, it allows other transaction to even modify and commit the rows opened in transaction 1, or it just restricts transaction 1 to see it, but other transactions are allowed to commit the rows which are open in transaction 1 ?
In mysql, as we can see, even I tried, the Session red is able to modify and commit data which are selected in Session blue.. Do it happens with other databases also,, or they restrict other transactions ( Red Session here) to even modify and commit data.
@Pep,, Please reply if you are able to understand my question…

Reply

Unfortunately, there’s no “normal behavior of repeatable-read” as the standard doesn’t describe how should it be implemented. Different providers implement it in different ways. In some cases, it is implemented with shared locks that don’t allow the data to be changed by another session. I submitted a paper for Percona Live Europe, and (if it is approved) I’ll talk about this more in depth covering different providers and other isolation levels. Also, I plan to write more about this subject. Stay tuned!

Reply

Thanks.

Reply

Leave a Reply

Your email address will not be published. Required fields are marked *