A couple of months ago I decided to give myself a refresher on the mechanics of InnoDB. Having a high level understanding of what’s going on under the hood can help provide the context needed in order to resolve issues you may encounter as well as assist you in ensuring that your MySQL instance is running efficiently. Everyone can stand to go back to basics every now and then, as it can help you pick up concepts that you may have missed the last time you researched the topic.
This time around I decided to give myself a refresher by re-reading the MySQL 5.6 reference manual, chapter 14, which covers the InnoDB engine. Despite having a wealth of documented knowledge and insights, I found that a lot of the points in the documentations were unclear, leading me to do more research and experimentation in order to get a bit of clarity on some of the specifics that I felt were missing. In order to help make the information a little clearer and more accessible I decided to create this blog post series on higher level InnoDB mechanics.
I believe that the most important mechanics to understand for any system are those that are exposed to and able to be configured by the user. As such, we’re going to explore mechanics by reviewing tunable InnoDB variables. My hope is to present these in a way that explores the variables, the mechanics behind them, the defaults assigned to these variables by MySQL, considerations for adjusting each variable, and showing the metrics that can be viewed that are related to the underlying mechanics if available. They will be presented in an order that will hopefully allow you to build your knowledge of InnoDB internals in a linear fashion instead of having to jump back and forth to different parts to the series.
This series will be broken down into 5 parts.
- I/O (structure and logs)
- I/O (Table data)
- Consistency / Statistics handling
Variables are grouped into categories based on their impact. For example, while innodb_log_buffer_size is technically a memory related variable, I believe it has a greater impact on I/O than it does on memory, so it’s in the I/O group.
I should note that while tuning recommendations are provided, the objective of this blog post series was NOT meant to be a tuning primer, but instead to explore the mechanics that each variable interacts with. As such, I would like to advise against reading this guide and trying to fine tune all of the available innodb variables. System variable tuning is an exercise in diminishing returns, the most benefit you’ll get out of tuning your MySQL server will occur within the first 15 minutes of configuration. In order to help prevent excess tuning I will provide a list of variables out of each section that should be set when configuring a new system.
The variables and defaults presented are what’s current for MySQL 5.6. A similar guide for 5.7 and 8.0 may follow in the coming months.
I would very much like to thank Ivan Groenewold for working with me and providing fact checking for this guide.
Let’s get started! In this part of the blog series we’re going to look at the InnoDB system variables that have the greatest impact on memory usage for InnoDB.
Definition: The is the main global memory cache established at MySQL initialization that’s used to support InnoDB operations. It mainly stores data and index pages, but it also caches other information such as undo log caching, the change buffer, etc. Everything that goes into or out of InnoDB passes through the buffer pool which makes this the single more important factor when it comes to understanding and working with InnoDB.
Default: 134217728 (bytes) (128M)
Tuning notes: When allocating space for the global buffers needed for MySQL to operate, you need to consider both the myisam key cache and the InnoDB buffer pool. If you are using myisam and InnoDB tables, be sure to allocate space for both. The objective to make these large enough to work with the active data set. For example, if you have a 1TB dataset, but only 25G is being used on a regular basis, you’ll want to make sure that the global caches are large enough to store that active data set, or at least as much of it as possible. If you don’t have enough memory to create a buffer pool large enough to house the active data set, you don’t need to immediately order more memory, the InnoDB buffer pool has a lot of mechanisms built in to handle data coming and going. Perform tests before making the decision to scale vertically.
Per the reference guide, if your buffer pool isn’t large enough to hold the full active data set you would want to consider setting it (combined with the myisam key cache) to 80% of your system memory. However I would suggest exercising caution when doing so. Consider using a tool like mysqlcalculator.com in order to ensure you don’t having MySQL consuming more than available system memory, which can cause swap and OOM issues. I have also run into cases where there were other unexpected processes consuming memory besides MySQL , meaning I had less memory to work with than I had anticipated, which is why I may set the global caches to 66% (or even less depending on the amount of memory available) if I’m not completely sure I’m working with a server that’s dedicated to running MySQL and nothing else.
Associated Metrics: In the SHOW ENGINE INNODB STATUS output, in the BUFFER POOL AND MEMORY section, you will see an entry for free buffers. If this is 0 then you know that there is no more space in the buffer pool, but this is considered to be pretty normal. Just because you have 0 space available it doesn’t meant that you haven’t allocated enough space. Buffer pool hit rate may be a better metric to use, but you will need to collect it several times to get a good representative sample to base on. If the hit rate is very low you may want to consider increasing the buffer pool size, but it would likely require an increase in system memory. If you get to this point you may want to consider altering activities of the buffer pool first (see below for other tunable variables) and also consider scaling out instead of up.
Definition: The number of buffer pool instances determines the number of individual InnoDB buffer pools that are created at MySQL startup. This is meant to reduce contention of individual caches. Each innodb buffer pool instance should be considered a ‘region’ of the InnoDB buffer pool as it divides the total size of the InnoDB buffer pool amongst the number of specified instances. IE: If you have an InnoDB buffer pool size of 8G and 8 buffer pool instances, each instance will be 1G in size.
Default: 8 (instances), However if you have a buffer pool of less than 1G, the default becomes 1. In earlier versions of 5.6 the default was 1.
Tuning notes: If you have a large InnoDB buffer pool, you should tune this so you have several instances. However, it should be noted that a hash is needed to support mapping of pages to their buffer pool instance. This hash will be needed regardless of how many instances you have if the number of instances is greater than 1. I would recommend starting with the default, but if you still find contention using the associated metric you try adding more. It’s generally considered to be a good idea to make sure no instance is less than 1G in size.
Associated Metrics: Performance schema: wait/synch/mutex/innodb/buf_pool_mutex
Definition: Pages move through the InnoDB buffer pool using a least recently used (LRU) algorithm. However there are two sections to the InnoDB buffer pool LRU list, old pages and new pages sublists. The new pages sublist sits above the old pages sublist in the LRU list. When a new page is read from disk and pulled into the InnoDB buffer pool, it’s not added to the top of the LRU list, instead it’s added to the old pages sublist and from that point one of two actions will occur. It will either be read again while it still exists in the buffer pool and will be moved to the top of the new page sublist (the very top of the LRU list), or it will not be read again and eventually will reach the bottom of the old pages sublist and be evicted from the buffer pool.
This variable designates how much of the InnoDB buffer pool should be used by the old pages sublist.
Default: 37 (percent)
Tuning notes: The reason why InnoDB doesn’t add newly read pages to the very top of the LRU list is ensure there is room for hot pages (ones that are read frequently) and so less frequently used pages don’t take up as much space and also get evicted from the buffer pool faster. However, you may have performance degradation if you have a lot of pages that are getting evicted before they ever become new, especially with smaller InnoDB buffer pools.
Associated Metrics: In the SHOW ENGINE INNODB STATUS output, in the BUFFER POOL AND MEMORY section, you’ll see information about pages that are made young (the addition to the new page section), vs the not young (pages that remained in the old page section). Unfortunately there is no specific metric that collects data about pages added to the old pages section that were evicted without being re-read, but the made young statistics may still be beneficial in gauging the improvement in the system based on changes you make to this variable.
Definition: This is how long a newly accessed page from disk must be stored in the old page sublist, regardless of how many times it’s been accessed. Once this period of time has elapsed, if the page is still in the buffer pool and if it gets accessed again it will move to the new page sublist.
Default: 1000 (milliseconds) (1 second). In earlier versions of MySQL 5.6 it was 0.
Tuning notes: This variable is meant to protect the InnoDB buffer pool from being completely overrun by full table scans. When a full table scan occurs it will move the contents of an entire table into the buffer pool where it may be read a small number of times in rapid succession, and then never be read again. Having this set to a value greater than 0 will prevent table scans from removing pages from the new page sublist that should not be removed. I would start with the default value and experiment to determine what value is correct for your workload.
Associated Metrics: In the SHOW ENGINE INNODB STATUS output, in the BUFFER POOL AND MEMORY section, you’ll see young/s and non-young/s. This is the per second average of pages in the old page sublist that were accessed and made young vs those that weren’t.
Innodb_additional_mem_pool_size (deprecated in 5.6.3, removed entirely from 5.7.4)
Definition: This is the memory that’s allocated to store data dictionary information as well as other internal data structures and information relating to connection/threads. If InnoDB runs out of memory in this space it will start allocating memory from the operating system. This is not really seen as a performance hit as most systems should be able to provide additional memory with very little overhead. However, if you go beyond the boundaries of this allocated cache, MySQL will start putting warning messages in the error log which could be concerning if you’re not familiar with this activity.
Default: 8388608 (bytes) (8mb)
Tuning notes: The more tables you have, the more likely you will need to increase this value. However I would refrain from changing the value until you know that an increase is needed by looking at the associated metrics.
Associated Metrics: In SHOW ENGINE INNODB STATUS under the section BUFFER POOL AND MEMORY you will see the text ‘additional pool allocated’ which shows the number of bytes allocated in the additional memory pool. ‘Dictionary memory allocated’ will show you how much of the additional memory pool is being used to store data dictionary information, but that’s more informative than anything else. Ideally the value of innodb_additional_mem_pool_size should be 8mb or 20% higher than the status of ‘additional pool allocated’, whichever is greater.
Definition: For a better understanding of what a hash index is and how it compares to the b-tree index mechanics often seen in InnoDB, I would recommend reading this page of the MySQL reference guide. InnoDB has a process that profiles queries against index usage on InnoDB tables. Based on its observations it will build a hash index that points to frequently accessed index pages based on a prefix of the index page in order to speed up lookups. Considering that this is a hash index, it can’t be used to speed up range searches or ordering, but it can greatly increase lookup times based on searching that point to specific records like those that use the =, <>, <=>, and IN operators.
Default: 1 (on)
Tuning notes: There is really very little control that you have of how adaptive hash index works or the algorithms that determine which pages will be moved into the hash index in MySQL 5.6. Tuning this variable requires viewing historical information based on the associated metric (see below) or experimenting using a full stack benchmark. It should be noted that adaptive hash index misses have a greater cost than if the adaptive hash index wasn’t present at all.
In the SHOW ENGINE INNODB STATUS output, if you see a lot of semaphores related to the btr0sea.c source file, one potential cause is the adaptive hash index and you might consider disabling adaptive hash indexing.
Associated Metrics: In the SHOW ENGINE INNODB STATUS output in the INSERT BUFFER AND ADAPTIVE HASH INDEX section, you will see information pertaining to the current size of the adaptive hash index, as well as how many searches used the adaptive hash index per second vs those that did not.
Out of the variables listed above, I would recommend you set the following when configuring a new MySQL instance.
Be sure to keep an eye out for the next installment of this blog series which will cover I/O pertaining to structure and logs!