Whenever we upgrade clients from one major version of MySQL to another, we strongly recommend testing in two forms.
First, a performance test between the old version and the new version, to make sure there aren't going to be any unexpected issues with query processing rates. Second, a functional test to ensure that the queries running on the old version won't hit syntax errors or reserved-word conflicts in the version we're upgrading to.
If a client doesn't have an appropriate testing platform for these kinds of tests, we leverage the available tools to test to the best of our ability. More often than not this means using pt-upgrade after capturing slow logs with long_query_time set to 0, in order to catch everything that runs on the server for a period of time.
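To make the capture step concrete, here is a minimal sketch (the threshold restored at the end is illustrative; use whatever long_query_time your server normally runs with):

SET GLOBAL slow_query_log = ON;
SET GLOBAL long_query_time = 0;
-- let this run for a representative window of traffic, then restore your normal threshold
SET GLOBAL long_query_time = 1;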
One of the issues you can run into with this sort of test is that pt-upgrade replays the queries one at a time. If you have a query that takes much longer to run in the new version, this can slow things down considerably. It also gets frustrating when that long-running query appears thousands of times in your slow query log.
If your objective is a functional test, just ensuring that you're not going to run into a syntax error in the new version, it makes no sense to run a query more than once. If it ran okay the first time, it should run okay every time, assuming that the literals in the query are also properly enclosed. So instead of replaying the entire log against the target server, we can first use pt-query-digest to create a slow log that contains one of each type of query.
Let's take a look at the example below, where I created a slow log with 5 write queries and 5 read queries that differ only in their literals.
[root@host slowlog]# cat ./testslow.log
......
use ptupgrade;
......
# Time: 180207 12:55:05
# User@Host: root[root] @ localhost []
# Query_time: 0.000134 Lock_time: 0.000051 Rows_sent: 0 Rows_examined: 0
SET timestamp=1518026105;
insert into t1 (c1) values (1);
# Time: 180207 12:55:06
# User@Host: root[root] @ localhost []
# Query_time: 0.000126 Lock_time: 0.000049 Rows_sent: 0 Rows_examined: 0
SET timestamp=1518026106;
insert into t1 (c1) values (2);
# Time: 180207 12:55:08
# User@Host: root[root] @ localhost []
# Query_time: 0.000125 Lock_time: 0.000051 Rows_sent: 0 Rows_examined: 0
SET timestamp=1518026108;
insert into t1 (c1) values (3);
# Time: 180207 12:55:10
# User@Host: root[root] @ localhost []
# Query_time: 0.000130 Lock_time: 0.000052 Rows_sent: 0 Rows_examined: 0
SET timestamp=1518026110;
insert into t1 (c1) values (4);
# Time: 180207 12:55:12
# User@Host: root[root] @ localhost []
# Query_time: 0.000126 Lock_time: 0.000050 Rows_sent: 0 Rows_examined: 0
SET timestamp=1518026112;
insert into t1 (c1) values (5);
# Time: 180207 12:55:17
# User@Host: root[root] @ localhost []
# Query_time: 0.000134 Lock_time: 0.000055 Rows_sent: 2 Rows_examined: 10
SET timestamp=1518026117;
select c1 from t1 where c1 = 1;
# Time: 180207 12:55:19
# User@Host: root[root] @ localhost []
# Query_time: 0.000121 Lock_time: 0.000053 Rows_sent: 2 Rows_examined: 10
SET timestamp=1518026119;
select c1 from t1 where c1 = 2;
# Time: 180207 12:55:20
# User@Host: root[root] @ localhost []
# Query_time: 0.000118 Lock_time: 0.000052 Rows_sent: 2 Rows_examined: 10
SET timestamp=1518026120;
select c1 from t1 where c1 = 3;
# Time: 180207 12:55:22
# User@Host: root[root] @ localhost []
# Query_time: 0.000164 Lock_time: 0.000074 Rows_sent: 2 Rows_examined: 10
SET timestamp=1518026122;
select c1 from t1 where c1 = 4;
# Time: 180207 12:55:24
# User@Host: root[root] @ localhost []
# Query_time: 0.000121 Lock_time: 0.000052 Rows_sent: 2 Rows_examined: 10
SET timestamp=1518026124;
select c1 from t1 where c1 = 5;
I then used pt-query-digest to create a new version of this slow query log with only one of each type of query.
[root@host slowlog]# pt-query-digest --limit=100% --sample 1 --no-report --output slowlog ./testslow.log
# Time: 180207 12:55:05
# User@Host: root[root] @ localhost []
# Query_time: 0.000134 Lock_time: 0.000051 Rows_sent: 0 Rows_examined: 0
use ptupgrade;
insert into t1 (c1) values (1);
# Time: 180207 12:55:17
# User@Host: root[root] @ localhost []
# Query_time: 0.000134 Lock_time: 0.000055 Rows_sent: 2 Rows_examined: 10
use ptupgrade;
select c1 from t1 where c1 = 1;
You'll notice that not only did we get one query of each type, pt-query-digest also added a use statement before each query so MySQL knows which schema to run the query against when the log is replayed.
You can now take this new slow log and run it via pt-upgrade against your target servers.
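As a rough sketch, a pt-upgrade run against two target servers might look like the following (the host names, credentials, and file names here are placeholders, not from the original example):

pt-upgrade ./testslow-digest.log \
  h=old-mysql-host,u=root,p=PASSWORD \
  h=new-mysql-host,u=root,p=PASSWORD > pt_upgrade_report.txt

pt-upgrade replays the log against both hosts and reports differences in results, warnings, and errors between them.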
Conclusion
If you have a large slow query log that you're trying to test against your server using a log replay tool like pt-upgrade, you can make your life a lot simpler by getting one sample of each query using pt-query-digest. In the field we've seen this reduce log file sizes from hundreds of gigabytes to less than a megabyte, and cut log replay times from weeks to minutes.
Please note that this is mainly something you'll want to consider for functional testing, as you may want a lot of variety in your literals when doing a performance test.
3 Comments
Hi Peter,
Very interesting reading.
I'm trying to use pt-query-digest and pt-upgrade to replicate the exact query load from a production server on a test server with a different configuration, and get a benchmark comparison.
The production server contains more than 500 databases with different types of workload.
The main problem with pt-upgrade is that it replays queries in isolation, one at a time, rather than as a realistic use case with multiple queries running at the same time.
Percona Playback does this kind of work, but it is very old, very buggy, and does not work with multiple databases.
Could you advise a tool/workflow to get this kind of benchmark data?
Thank you very much
Alex,
Performance testing is something that's always a bit complicated, as there are a few tools here and there that can help, but there is no single really good and simple solution... yet. You can use something like Percona Playback or pt-upgrade to replay a production workload log, but they're single-threaded. You can use something like mysqlslap, which is multi-threaded but typically only works on a small subset of queries that all usually use the same literals. You could use sysbench, which is multi-threaded and will vary its literal usage, but doesn't really simulate a production workload. Currently the recommendation I make for my clients is to use a mix of the three, or to create a tool that emulates your workload, which is very handy to have but can be hard to create and maintain.
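To illustrate the mysqlslap case (the host name, schema, query, and concurrency values below are illustrative placeholders), a multi-threaded run over a single statement looks something like:

mysqlslap --host=test-host --user=root -p \
  --create-schema=ptupgrade --no-drop \
  --query="select c1 from t1 where c1 = 1" \
  --concurrency=16 --iterations=10

This gives you concurrency, but every thread hammers the same statement with the same literals, which is exactly the limitation described above.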
What excites me right now is the potential to use ProxySQL to help solve this problem. You can create query rules to take a query meant for one host and send it to multiple hosts instead. This could potentially be used for both functional and performance testing. Some metrics already exist, but the logging is still a bit lacking: the last time I checked, it would not record error or warning output from the mirrored hosts, though there is a feature request for this. Keep an eye out for ProxySQL, read up on the query rules section of the user guide, and let's keep our fingers crossed that in a few months' time we won't need to worry about the ongoing challenges with performance and functional testing.
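For reference, a query rule that mirrors traffic to a second hostgroup is configured roughly like this in ProxySQL's admin interface (the rule_id and hostgroup numbers are illustrative):

-- send SELECTs to hostgroup 10 and mirror a copy to hostgroup 20
INSERT INTO mysql_query_rules (rule_id, active, match_pattern, destination_hostgroup, mirror_hostgroup, apply)
VALUES (1, 1, '^SELECT', 10, 20, 1);
LOAD MYSQL QUERY RULES TO RUNTIME;
SAVE MYSQL QUERY RULES TO DISK;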
OK, thank you very much for your suggestions.