Isolation levels are a rare subject in MySQL literature. The documentation provides a terse description and focuses mainly on locking issues, but does not discuss the semantics of each isolation level. This is not only a problem that affects MySQL documentation but also the SQL standard itself.
Both the lack of documentation and the absence of a deeper description of the expected behavior in the SQL standard make isolation levels a topic that is more assumed than known by database administrators and developers. In this blog post, I aim to help you understand how the default isolation level in MySQL works and show you some surprising facts about it.
But first let’s see how isolation levels are described in the standard: “The transaction isolation level of a SQL-transaction defines the degree to which the operations on SQL-data, or schemas in that SQL-transaction are affected by the effects of and can affect operations on SQL-data or schemas in concurrent SQL-transactions”. To put it in plain words, isolation levels define how concurrent transactions interact while modifying data.
MySQL uses Repeatable-read as the default level. In the standard, this level forbids dirty reads (non committed data) and non repeatable reads (executing the same query twice should return the same values) and allows phantom reads (new rows are visible). But MySQL implements it in a different way. Let’s see how it is implemented with some examples.
MySQL Repeatable-Read tests
We create two connections against a MySQL server. We will call them Session Blue and Session Red (The fact that these are the colors of FC Barcelona is purely coincidental). In Session Blue we will create the database isolation and the table repeatable_read, both will be required for this test.
No phantom reads… only phantom writes!
We start a transaction, verify current isolation level by checking the value of the variable tx_isolation, and retrieve the contents of repeatable_read table, this way we create a snapshot of that table for the whole transaction.
Now we move to Session Red and insert two rows into the table. We commit the transaction to make sure data is updated.
Next we check what data is retrieved in Session Blue.
As we can see, repeatable-read in MySQL avoids Phantom Reads, as rows are not retrieved. This is more restrictive than the standard description of the isolation level. But, what happens if we try to update the table contents? Our intuition is that we should not update any rows.
Surprise! The update command tells us that one row matched and one row was changed. Let’s select table contents to view what is happening.
We see just one row, the row that was modified by the update executed before. This is quite unexpected and counter-intuitive as the table never had one single row committed; we inserted and committed two rows. We are seeing a view of the table that never existed. As expected, when we commit, we see both rows, the one we modified and the other that was inserted before in Session Red.
More phantom writes!
Now we will begin another transaction and retrieve table contents. We retrieve table contents to create a snapshot (probably we should call it a version) of the table.
Back in Session Red, we will run a transaction to update the contents of the table. Note: Phantom reads only affect new rows, not the ones already existing.
Let’s find what Session Blue retrieves and what can update.
Initially, we see the table unchanged. But no rows matched when we try to update the table using the data we retrieved in the select. We see one row with value “modified” for the text column, but the update finds no rows. When we update the table using a column value that was not modified by any transaction, in this case id, then we are able to proceed. Now we see the new value for the text column.
WYSINWYW (What you see is NOT what you write)
We will return our table to the original values and we will create an additional table required for the next test.
As usual we start a new transaction and we retrieve table contents to create the snapshot.
Now we go back to Session Red to update the contents of the whole table.
Returning to Session Blue, we “clone” the contents of the table repeatable_read to repeatable_read_copy table using an insert into … select statement. After that we retrieve the values of both tables using a select.
The values of rows inserted into the copy table using an insert into … as select is different than the values of rows retrieved using a regular select statement. Once we commit the transaction, as expected, we are able to see the modified data in the original table too.
After these tests, we have found about MySQL implementation of Repeatable-read isolation level:
- When using just select statements is even more restrictive than SQL Standard, as Phantom Reads do not happen. Besides the snapshot is used for all tables and all rows, as we find while we use a mysqldump with –single-transaction.
- When the transaction modifies data, the behavior is a mix of Repeatable-read (rows not modified are not visible) and Read committed (modified rows are visible). We cannot say that this is not the standard as these situations are not described in it and do not fit in the three concurrency phenomena: Dirty Read, Non-repeatable Read and Phantom Read.
- When the transaction writes new data based on existing data, it uses the committed data, instead of the snapshot retrieved previously. This is valid both for modified and new rows, mimicking Read committed behavior.
The way MySQL implements Repeatable Read is not intuitive and, although it is required to support statement replication, can lead to some problems while running data modification and transfer to other tables in concurrent transactions. If your application can face these issues, you will need to modify your queries using select … for update statements and thus increase the number of locks in the database.
I think we can describe this behaviour in a simple statement: Repeatable Read only applies for Selects, for DML it behaves as Read Commited.
In fact, the statement should be that Repeatable Read only applies for non-locking queries.
If you do a locking SELECT statement using FOR UPDATE or even FOR SHARE aka LOCK IN SHARE MODE, the query reads as though you were using a Read Committed isolation level.
You can even switch back and forth in the same Repeatable Read transaction — when you use locking SELECT statements, you can see different rows than when you use non-locking SELECT statements.
Really great blog entry, but I hope in the future you actually style this with CSS and not use hard-to-read graphics.
Thank you very much. I’m sorry if the graphics are hard to read, and also not very good for SEO, but I wanted to stress the point that these are actual screen captures. In any case, thank you again, and I’ll use CSS next time.
At the very start, when you insert two rows (using Red Session) and commit changes, the Blue Session will show two records actually (after the SELECT statement). And this is expected behaviour IMO. I just tried it. Also, why do you think that rows shouldn’t show in a different session, if changes are already committed by another session ? The transaction was successful, and other session can see the inserted rows… Can you please comment on this?
The Blue Session will not show the rows as long as the isolation level is REPEATABLE-READ and you are within a transaction. In REPEATABLE-READ you won’t see changes already committed. Make sure you are following all the steps in the Blue Session, this is “start transaction” and verify that isolation level “show variables like ‘tx_isolation'”.
Hi, thanks for your response… Well what I did is:
1. Started transaction in a Blue Session
2. Started transaction in a Red Session
3. Have inserted two rows in a table from within Red Session
4. Committed changes in a red session
5. Selected all rows in a red session, and got two rows
6. Selected all rows in blue session , and got two rows as well (while blue session transaction was still uncommitted)
Why do I see different results than in your tutorial ? I have checked, and tx_isolation variable value is REPEATABLE-READ.
@Pep Pla , Sorry I just got the results like from your tutorial.. I don’t know why it isn’t worked for me at first. But, now it is working as you described…
Okay, I actually found why sometimes a Blue Session shows no rows, and sometimes shows 2 rows. If I make a SELECT query before I commit changes in Red Session, I get empty result set. So, because the name says repeatable-read, every next read is the same as the first one, and you always get empty rows. Now, if I do a SELECT in a Blue Session for the first time, but after the COMMIT; is executed in a Red Session, I am able to see two rows added from within Red Session. You can test this out… Any insights would be appreciated.
For REPEATABLE-READ, the snapshot is created the first time data is recovered from the database, not when the session starts. This means that every change that takes place before the first data is retrieved will be visible for that session.
Hi, Thanks for the blog,,, I just have one doubt…
1.) Let’s forget about mysql,,, As a normal behaviour of repeatable-read. Do, it allows other transaction to even modify and commit the rows opened in transaction 1, or it just restricts transaction 1 to see it, but other transactions are allowed to commit the rows which are open in transaction 1 ?
In mysql, as we can see, even I tried, the Session red is able to modify and commit data which are selected in Session blue.. Do it happens with other databases also,, or they restrict other transactions ( Red Session here) to even modify and commit data.
@Pep,, Please reply if you are able to understand my question…
Unfortunately, there’s no “normal behavior of repeatable-read” as the standard doesn’t describe how should it be implemented. Different providers implement it in different ways. In some cases, it is implemented with shared locks that don’t allow the data to be changed by another session. I submitted a paper for Percona Live Europe, and (if it is approved) I’ll talk about this more in depth covering different providers and other isolation levels. Also, I plan to write more about this subject. Stay tuned!
Wonderful post! We will be linking to this particularly great post on our site.
Keep up the great writing.
Hi Pep Pla,
First off, thanks for a great article. Really glad I came across this article.
I saw this whilst researching tx isolation levels and noticed MySQL repeatable reads do NOT have phantom reads.
The great thing with this apparent odd behaviour regarding “phantom writes” is that it allows for the implementation of optimistic locking which is something even possible with read_committed isolation level.
Suppose we had a table — accounts(id, amount, version) and 2 sessions (S1 and S2):
(Imagine accounts had values for id=1, amount=100 and version = 1)
S1 starts transaction and then reads accounts table [id=1, amount=100, version=1]
S2 starts transaction and then reads accounts table [id=1, amount=100, version=1]
S1 then updates accounts with a query like: update accounts set amount = 150, version = 2 where id=1 and version = 1;
After this a select within S1 should values of [id=1, amount=150, version=2].
S1 then commits the transaction.
if S2 now attempts to update the same row eg. update accounts set amount = 400, version = 2 where id=1 and version = 1;
What happens is that 0 rows will be updated because [version] is no longer equal to 1 but rather 2 (even though a select before the update in S2 will still have shown version as 1;
Since 0 rows are returned we know another session has updated this row. If this were not the case this update statement would have also successfully updated the row and committed the transaction which would depending on the application logic be probably not be what is intended. So the fact that these “phantom writes” happen in repeatable reads is actually quite a useful feature.
I’ve been absent for a while, but now I remember why I used to love
this website. Thanks, I will try and check back more frequently.
How frequently you update your web site?
I’d forever want to be update on new blog posts on this site, saved to bookmarks!
because, innodb does current-read implicitly at first with in update. session B will read the session A commit data with current-read. So session B can update the data session A commit. And then session B can read that data.
I have 1 query as I am getting confused and not able to find a correct answer-
In this blog, in heading “How database locking works” , for repeatable_reads, its written that –
select in transaction 1 applies lock on that row, so if transaction 2 tries to update during that time, it will have to wait.
But in your tried steps – you started select in transaction 1 and then update in transaction 2 on the same row and it went through fine.
Sorry forgot to mention the blog link above – https://retool.com/blog/isolation-levels-and-locking-in-relational-databases/#:~:text=You%20can%20think%20of%20isolation,even%20on%20a%20transaction%20level).