When it comes to schema design, one of the more common issues I see for INNODB tables is in the selection of a primary key, or an absence of a primary key entirely. Today I would like to illustrate some of the best practices when it comes to selecting a primary key for your consideration as you design new tables for your schema, or modify existing ones.
Let’s start by reviewing how the table structure works in INNODB.
Everything in INNODB is an index, even the base table where you store your data, which is considered to be a clustered index. Indexes require some kind of node identification, and if a primary key is explicitly defined it will act as that clustered index node identifier and will determine the order of the data as it’s stored on disk. If no primary key is defined, INNODB will check to see if a secondary non nullable unique index is available and if so it will use that to form identification of nodes in the clustered index. If no secondary unique index is available, INNODB will still create a node identifier on the back end called DB_ROW_ID, which is an implicit identifier that increments as data is added to the table.
Here are a few points to consider which will illustrate why you should consider explicitly defining a primary key, and how it should be formed.
Without an explicit primary key, MySQL will create an identifier for you anyway but you won’t be able to use it.
As noted above, if you don’t have a primary key or a secondary non nullable unique index defined, INNDOB is going to create an identifier for you, but it’s going to create it on the back end where you won’t be able to actually use it. If you’re going to have a row identifier in your table, you might as well have it exposed so you can make use of it.
You can save disk space by creating a column for a primary key.
When you don’t have a primary key defined, you can actually end up storing more data on disk. In the following example I’ve created a schema called ‘pktest’ and have created two tables ‘test’ and ‘testpk’. Both tables have a column where a single bit is stored, but ‘testpk’ also has a second column called ‘id’ which is mediumint, is set as the primary key for the table, and is set to auto_increment. Both tables were populated with 15M records of data…
mysql> show create table test \G *************************** 1. row *************************** Table: test Create Table: CREATE TABLE `test` ( `data` bit(1) NOT NULL ) ENGINE=InnoDB DEFAULT CHARSET=latin1 1 row in set (0.00 sec) mysql> show create table testpk \G *************************** 1. row *************************** Table: testpk Create Table: CREATE TABLE `testpk` ( `id` mediumint(8) unsigned NOT NULL AUTO_INCREMENT, `data` bit(1) NOT NULL, PRIMARY KEY (`id`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 1 row in set (0.00 sec) mysql> select count(*) from test; +----------+ | count(*) | +----------+ | 15000000 | +----------+ 1 row in set (21.46 sec) mysql> select count(*) from testpk; +----------+ | count(*) | +----------+ | 15000000 | +----------+ 1 row in set (4.43 sec) mysql> select data+0 from test limit 10; +--------+ | data+0 | +--------+ | 0 | | 0 | | 0 | | 1 | | 1 | | 0 | | 1 | | 1 | | 0 | | 1 | +--------+ 10 rows in set (0.06 sec) mysql> select id, data+0 from testpk limit 10; +----+--------+ | id | data+0 | +----+--------+ | 1 | 1 | | 2 | 1 | | 3 | 0 | | 4 | 1 | | 5 | 0 | | 6 | 1 | | 7 | 1 | | 8 | 1 | | 9 | 1 | | 10 | 0 | +----+--------+ 10 rows in set (0.00 sec) mysql> exit Bye [[email protected] pktest]# pwd /var/lib/mysql/pktest [[email protected] pktest]# ls -lh total 769M -rw-rw---- 1 mysql mysql 65 May 25 08:55 db.opt -rw-rw---- 1 mysql mysql 8.4K May 25 08:55 test.frm -rw-rw---- 1 mysql mysql 408M May 25 09:16 test.ibd -rw-rw---- 1 mysql mysql 8.4K May 25 09:10 testpk.frm -rw-rw---- 1 mysql mysql 360M May 25 09:20 testpk.ibd
As you can see above, the table where the primary key is defined has used less space. The reason for this is that the implicit identifier DB_ROW_ID is a 6 byte integer, where mediumint only uses 3 bytes. So if you have a table where each row can be identified by a datatype smaller than a BIGINT UNSIGNED (IE: Your table is going to have 4294967295 rows or fewer), you will save disk space by using an explicitly defined integer as a primary key.
It should also be noted that this is going to save space when it comes to secondary indexes as well as each node in the secondary index is going to reference the identifier for the clustered index.
The smaller the data footprint, the more you can store in the innodb buffer pool.
Some external tools may not work without a primary key.
One other item to note is that external tools like pt-online-schema-change won’t work unless there is an explicitly defined primary key.
[[email protected] ~]# pt-online-schema-change --alter "ENGINE=INNODB" D=pktest,t=test --user=root --password=$pw --dry-run Operation, tries, wait: copy_rows, 10, 0.25 create_triggers, 10, 1 drop_triggers, 10, 1 swap_tables, 10, 1 update_foreign_keys, 10, 1 Starting a dry run. `pktest`.`test` will not be altered. Specify --execute instead of --dry-run to alter the table. Creating new table... Created new table pktest._test_new OK. Altering new table... Altered `pktest`.`_test_new` OK. 2016-05-27T15:17:23 Dropping new table... 2016-05-27T15:17:23 Dropped new table OK. Dry run complete. `pktest`.`test` was not altered. The new table `pktest`.`_test_new` does not have a PRIMARY KEY or a unique index which is required for the DELETE trigger.
Row based replication will be less efficient without an explicit primary key.
When you replicate using row based binary logging, a full table scan is required to update the record on the slave server due to the fact that the implicit identifier DB_ROW_ID is not included in the binary log. This will apply to mixed mode binary logging as well when mysql finds a transaction to be non-deterministic in nature and converts the binary logging even to row-based.
Try to use integers as identifiers wherever possible.
You should avoid using strings (UUIDs) as identifiers whenever you can. When you insert new data into your table it’s going to order it on disk based on the order of the identifier of the primary key. So having string values where something can be inserted in the middle of the table will likely result in excess page splitting. You can help alleviate this problem by adjusting the innodb_fill_factor variable (now available in MySQL 5.7) to allow for greater space availability in each page, but overall this is going to use a lot more space on disk and isn’t going to help you anywhere near as much as changing the identifier to an integer would.
Remember, data being inserted into the clustered index has to be pushed to disk as innodb moves through its cycles and flushes out dirty pages, so there will be higher IO usage than string data that’s added to a secondary index where the change buffer may hold onto the data in memory (with supporting records in the system tablespace) for longer periods of time before finally flushing the secondary index data to disk.
One potential exception to the above comment about avoiding UUIDs is that if you are using UUID type 1 (this is what is provided with the mysql function UUID()), and if you remove the dashes and reverse the order of the first three sections of the output and use this as your primary key, you can save space and get similar insert speeds than what you may have when using an integer auto incrementing identifier with a secondary index to support the UUID. This method was outlined in a blog post by Karthik Appigatla.
In conclusion, I recommend taking the following into consideration when it comes to primary keys when creating new tables, or when your architectural cycle provides you the opportunity to revise existing tables.
- Use and explicitly defined primary key, especially if the primary key data type is going to be less than or equal to 6 bytes. MySQL is going to create one for you, you might as well have it available to you, maybe even save some space on disk, and have access to useful external tools.
- It’s preferential to use integers to identify rows in your table.
- If you must use UUIDs, and if you are using UUID type 1, reverse those first three sections and remove the dashes.
It is pretty nice blog explaining why and how we should use PK for Innodb tables!
Good explanation! Very clear…