Connection timeout parameters in MySQL

Pythian Marketing

March 17, 2016

Tags: Oracle E-Business Suite, Technical Track, Haproxy

Introduction

wait_timeout
interactive_timeout
net_read_timeout
net_write_timeout

What do these timeouts do in MySQL? If you search the web for one or more of these, you may find complaints that no comprehensive explanation exists for all of these timeouts in one place (besides the obvious documentation of dynamic server system variables in MySQL). This blog post seeks to provide a central documentation source for timeouts and provide some practical explanation. Knowing what timeouts do helps in a troubleshooting effort. It's good to understand when an issue is timeout related and when it's not, and to know the right reasons for changing timeout variables, or the right time to ask the developer or ad-hoc user to please tune the variables in the session, instead of asking the DBA to change the global variables. Before diving into the meat of this topic, here are a few introductory concepts that I will touch on. I mentioned session variables. The four timeout variables we are discussing: interactive_timeout, wait_timeout, net_read_timeout, net_write_timeout can all be set within the context of the session using the SET command. e.g. " set session wait_timeout=3600;". A general rule of thumb that I always try to follow: Once I have what I think are correct timeouts to protect my mission critical application, I do not want to change them unless there is a really good reason. If I'm encountering a client disconnection issue and it's not coming from my main application, I need to see if the problem can be alleviated by adjusting timeouts on the session level using SET commands. Second, it's important to note that timeouts are there to protect your server and critical application. You don't want too many clients connected to your server doing nothing (they can crowd out connections that need to do something). You don't want clients that are in a failed state due to a network connection or other interruptive problem to continue to consume resources such as locks on your database. Timeouts should be set very thoughtfully, based on the unique environment. If you're unsure, leave at defaults and adjust when needed. Finally, I created a perl script to illustrate wait_timeout situations. I made a great effort to also incorporate net_read_timeout and net_write_timeout into the script and you'll see empty functions there as place-holders. There are many layers to the "MySQL client/server onion" involving buffers on the system, client, and server layers. In lieu of providing a direct scripting examples of net_read_timeout and net_write_timeout, I hope you'll find the other methods of explanation included below, useful.

interactive_timeout

According to the MySQL manual, Interactive_timeout is only used for connections from interactive clients like the command line MySQL clients. The technical explanation is: "clients utilizing the " CLIENT_INTERACTIVE option to mysql_real_connect()." Check the specific documentation to your favorite client to see whether it fits this description. If your client fits this behavior, wait_timeout is set to the value in interactive_timeout. The only benefit to this parameter is flexibility. If your command line or other interactive clients have a vastly different requirement than your application, feel free to set this differently than wait_timeout.

wait_timeout

Wait timeout is simply there to protect you in the common case of clients sitting there doing nothing but absorbing a connection. You want to set this to the lowest acceptable number of seconds in order to protect your server against an application malfunction or some other event that may cause too many connections to the database to be opened, crowding out other clients. If a client is doing nothing for wait_timeout seconds, the MySQL server will terminate the connection. The proper setting for this variable depends on the particular environment. An environment I worked in for 3+ years set this to 120 seconds, because jboss connection pooling (and associated parameters) took care of making sure the mission critical application in the environment had available connections. A common misconception and common misuse is to try to adjust wait_timeout for a situation in which a query has been interrupted. Wait timeout only applies to idle sessions. If the connection is in any other state besides idle, wait_timeout does not apply. This is why utilities like pt-kill exist. It does not help that an idle connection being interrupted by KILL produces the same error message. Example using my perl script and the MySQL processlist:

 jscott@js-trusty1:~$ ./test_timeouts.pl -K Doing Killable Wait Timeout Test ATTN: You have 20 seconds to kill the MySQL thread!!

in another shell

 jscott@js-trusty1:~$ mysql -e "show processlist" +----+------+-------------------+--------------+---------+-------+-------+------------------+ | Id | User | Host | db | Command | Time | State | Info | +----+------+-------------------+--------------+---------+-------+-------+------------------+ | 36 | js | 172.19.73.2:58892 | NULL | Sleep | 28514 | | NULL | | 79 | root | localhost:43350 | timeout_test | Sleep | 4 | | NULL | | 80 | root | localhost | NULL | Query | 0 | NULL | show processlist | +----+------+-------------------+--------------+---------+-------+-------+------------------+ jscott@js-trusty1:~$ mysql -e "kill 79";

back to the perl script output.

 DBD::mysql::st execute failed: MySQL server has gone away at ./test_timeouts.pl line 134. ------------- Err: 2006 ErrStr: MySQL server has gone away State: HY000

Note that we got MySQL error code 2006 "MySQL server has gone away". Now we'll allow wait_timeout to trigger instead of killing a process.

 jscott@js-trusty1:~$ ./test_timeouts.pl -W Doing Wait Timeout Test DBD::mysql::st execute failed: MySQL server has gone away at ./test_timeouts.pl line 151. ------------- Err: 2006 ErrStr: MySQL server has gone away State: HY000

The fact that a killed idle process and a process receiving wait_timeout get the same error message and code (2006 "MySQL server has gone away") causes much confusion. Rest assured, wait_timeout only kills idle connections, even if the error messages are the same. If there is a running query, you will typically get error code 2013: "Lost connection to MySQL server during query". The most common reason a query is killed, in my experience, is a backup. Many backup utilities for MySQL have the default behavior of killing long-running queries.

net_read_timeout

net_read_timeout is the number of seconds mysql will wait between bytes as you are sending it data. As mentioned above, I tried very hard to synthetically cause a net_read_timeout with a perl script and was unsuccessful. The best way to explain net_read_timeout is to use the example of an ETL (Extract, Transform, Load) job. At the beginning of an ETL job, you select data from one data source, then transform and load it into another data source, such as a MySQL data warehouse. Tools like "Pentaho Data Integration" (cited below) provide a GUI view of ETL. [caption id="attachment_85957" align="alignleft" width="500"]

Pentaho ETL Process[/caption] If your ETL job is complex, and you use something similar to Pentaho Kettle, you have very little visibility into what is happening behind the scenes. If the steps in between the initial query in an ETL job and an output step consume more than net_read_timeout seconds, then an error may occur due to incomplete statements being sent to the "output" steps. In mid-transaction, if the MySQL server fails to receive data within the timeout interval, the connection will be terminated. Another way net_read_timeout can occur: Normally all database connections used in a Pentaho ETL transformation are opened at the beginning of the transformation. If you have "use result streaming cursor" turned off on the database connection used for "table input", the result set may take longer than "net_read_timeout" seconds to receive, causing an error on the database connection being used for the output steps. Have you ever executed a simple query (with a large result set) on the MySQL command line and wondered why there is a pause of several seconds or even minutes prior to beginning to receive the results? Your client actually has begun receiving the results, but they simply have not been presented to you. The reason for this is, it's generally better for a MySQL server to be "finished" with a query as soon as possible; therefore, the default behavior of the MySQL client libraries on a read-only query is to receive all the results first, before passing the results onto the program. The default behavior is called " mysql_store_result". The MySQL reference manual does a good job of explaining this in detail. As the manual states, you should not change the default behavior unless you are doing a minimal amount of processing on each result row. Homemade ETL programs in other languages can have the same issue. Take care to use auto_reconnect features and/or streaming cursors / "mysql_use_result", when appropriate. ETLs and scripts are good examples of clients which can be modified with SET commands. "SET SESSION net_read_timeout=<longer value>;" should be considered for jobs outside of your main application. Lastly, net_read_timeout is also often caused by bad network connections. This is the default explanation offered in many blog posts on this topic. The Error returned to the client will be (2013) "Lost connection to MySQL server during query".

net_write_timeout

net_write_timeout is the number of seconds MySQL will wait for the client to receive additional data before terminating a connection. Using the example of the ETL job above, if steps in the middle of the transformation cause the "input" steps to pause for longer than net_write_timeout seconds, it's possible to receive an error. I've never seen net_write_timeout hit because of an ETL job. The MySQL client has default behavior of buffering result sets. Also, I try to use the "use result streaming cursor" in Pentaho or the "mysql_use_result" option in the Perl DBI, sparingly. The most typical case of receiving net_write_timeout is a backup such as a mysqldump. During a long data transfer such as a mysqldump, a myriad of factors could come into play, not the least of which is network connectivity. Remember, if you run into problems with net_write_timeout on a mysqldump backup, mysqldump has its own section of the my.cnf. You can set an option file on a client or a server to specifically set this and other variables for a mysqldump session.

Insight and analysis of technology and business strategy