I’m not the first and certainly won’t be the last to echo the universally shared opinion that Percona’s first time at the helm of the MySQL Conference and Expo in Santa Clara was a roaring success. All of last year’s FUD surrounding their adoption of the main MySQL event was dispelled, with room to spare. I had attended Percona Live in London back in October of 2011 to witness first hand their conference-organizing ability, and it has been optimized from great to awesome since then. After reading some tweets from @xarbp, it was apparent that the conference was going to be well attended, and I knew of many travelling from far afield.
Flying from the UK with fellow MySQL DBA Ben Mildren (@productiondba) to join up with Raj Thukral (@thukralraj), Marco Tusa (@marcotusa), and Singer X.J Wang (@singerwang) all the way to the West Coast to attend a conference was questioned by a few friends who work in technical roles outside the MySQL/FOSS world. However, this just highlighted to me how different the MySQL community is from others. From the moment I registered for my ticket to…well…now, I’ve been excited about attending ‘the’ annual MySQL event. I hadn’t had the pleasure of attending any of the O’Reilly-arranged conferences, but I had it on good authority that this year’s Santa Clara installment was up there with the best of them over the years.
We arrived on Sunday afternoon at the hotel and were greeted by friendly and helpful hotel staff, an indication of why the Hyatt is the venue of choice for the Santa Clara conference. We checked in, dumped our luggage, and took a tour of the hotel. We weren’t the first on site, and we recognized some of the organizers from the Percona Live London event. With a little under 48 hours to go, there was much for the Percona staff to complete before things were ready for the 1000+ attendees. We toured some of the surroundings of the hotel and could see things taking shape. There were sponsor and schedule boards up already, which reinforced the fact that we had arrived and that the conference was only hours away. Then we had to await the arrival of some old and current colleagues.
Before long, Tuesday morning was upon us. Ben had a joint tutorial with Henrik Ingo about the various high availability solutions to consider when architecting a MySQL environment. The tutorial was well received, and we found a few tweets to confirm how well we felt it went. It was great to have Twitter to hear/read instant feedback. The guys were happy to begin the conference’s many talks about Galera and XtraDB Cluster; plenty of buzz around those talks ensued. The expo room opened after the tutorials had finished, so we took a stroll around to see the many stalls, which ranged from sponsors like NuoDB, New Relic, Facebook, and InfoBright, to support providers such as PalominoDB and SkySQL. There was also a ‘dot org’ corner for the non-profits. Amongst the ‘dot org’ stalls were favorites such as MariaDB, Drizzle, CentOS, and Mozilla.
For me, the conference took flight on Wednesday with the ‘break out talks’. So began talks on the many, many parts of the MySQL ecosystem. I registered in good time to receive my ‘swag bag’, and with great generosity, Percona included the fabulous 3rd edition of the High Performance MySQL book published by O’Reilly and written by Baron, Peter, and Vadim. I already covered a few of the chapters while travelling home – it’s a must-have reference book for anyone using or supporting MySQL.
The keynote talks included some words from Peter and Baron of Percona, Mårten Mickos, now of Eucalyptus Systems, and Brian Aker of HP. I was impressed by the new HP cloud product powered by OpenStack, now with an Aker-driven DBaaS backed by a tuned Percona Server. It was interesting to watch the demo video on creating new instances as well as taking snapshots of existing instances to create cloned instances. I would like to review this for myself, and I will, since the HP guys were offering to send beta access to the attendees.
The first session I attended was a talk on testing MySQL databases by Percona’s Patrick Crews. Not being from a development background, I’m not overly familiar with unit testing or test-driven development, but I’m aware of its existence, and I was curious to learn about methods that could be used for SQL. Patrick detailed the use of MTR (MySQL Test Run), RQG or RandGen, and the Kewpie test suite, which can encapsulate the use of the previously mentioned tools and be extended using custom Python code. Ben and I discussed how this could be used to create tangible tests for benchmarking client servers, and the notion of building a grammar library of our own to let us build repeatable test environments. Patrick was a solid presenter and knew his chosen subject in depth, so he was at ease during his Q&A session. It was overall a great talk and a fantastic start to the day.
Second, I was pleased to attend another Stewart Smith talk. I had previously seen some Drizzle introduction talks presented by Percona’s Director of Server Development and was impressed with his clarity and style of delivery. I wanted to attend this talk partly for that reason, but also because I had performed some load testing recently on a customer site and was looking for a better way to emulate the application’s true production load. That way, I could test configuration changes without using production as a test bed.
I was not disappointed. This talk was titled ‘Replaying database load with Percona Playback’. Stewart began the presentation logged into Launchpad, which seemed curious to me. When all the delegates had found seats and settled, Stewart proceeded to inform us of his current status: he was releasing version 0.2 as part of the talk. A new experience for me at a conference – I haven’t heard of this being done before, and it added to my craving to hear more details about the tool.
I learned that Percona Playback is a tool that allows you to use slow log output to emulate a server’s activity. The difference between this and a simple log player is that Playback respects the durations recorded in the log. So if the log contains a statement that took 10 seconds to complete, Percona Playback will take the same 10 seconds; if the statement finishes in 5 seconds on replay, Playback will sleep for the remaining 5 seconds. The report from Percona Playback will distinguish these differences. I see the value here when making configuration changes or testing the production activity on a new set of hardware. To me, it’s important to see how changes might affect the customer’s application and schema rather than an unrealistic sysbench or mysqlslap load.
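The pacing idea is easy to picture in a few lines. This is my own simplified sketch in Python, not Playback’s actual implementation; the entry list and executor callable are hypothetical stand-ins for the slow log and the database connection:

```python
import time

def replay(entries, execute):
    """Replay logged statements while preserving each one's original duration.

    entries: list of (sql, original_duration_seconds) taken from a slow log.
    execute: callable that runs the SQL and returns when it completes.
    Returns (sql, original_duration, replay_duration) tuples for reporting.
    """
    report = []
    for sql, original in entries:
        start = time.monotonic()
        execute(sql)
        elapsed = time.monotonic() - start
        if elapsed < original:
            # The statement was faster on this server: sleep the difference
            # so the replay as a whole keeps the production pacing.
            time.sleep(original - elapsed)
        report.append((sql, original, elapsed))
    return report
```

The report tuples are where the interesting comparison lives: statements that ran markedly faster or slower than their logged durations are exactly the ones worth investigating after a configuration or hardware change.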
I’m looking forward to testing this more and watching as Stewart & team add more functionality. Stewart discussed how there might be room for using libpcap to capture the load from a production host’s network interface. The point of working from logs is to remain as unintrusive as possible, gaining insight into a load that isn’t affected by the monitoring itself. Here is a link to the launchpad project.
Up third was another talk by the Percona crew. I sat among the many attendees packed into Ballroom ‘B’ to listen to a talk I had seen in London. I attended a repeat performance because the tools involved had evolved since London. Baron Schwartz delivered ‘Diagnosing Intermittent Performance Problems’. In this talk, Baron described how to capture and analyse system metrics using the Percona Toolkit tools pt-stalk and pt-sift. These are special tools that were previously seen in the Aspersa toolkit, and they really take the pain out of hunting for the needle in the haystack. The guys at Percona consistently do a stellar job with their presenting. Baron in particular is one to emulate if you’re in the speaker space: he’s concise and entertaining, giving demos and war stories along the way.
The tools pt-stalk and pt-sift give the user the ability to perform a system-wide collection and analysis. Think of a tool for gaining visibility into the performance of your machine, and it’s likely to be rolled into the pt-stalk script. Calls to vmstat, iostat, etc. can be triggered by a conditional check on a metric that might be causing some concern or be symptomatic of a larger issue. When complete, your output is a series of files that can be probed by the pt-sift script to pick through the volumes of information provided by pt-stalk for a time period of interest. Rest assured, I’ll be looking for excuses to use this pair to diagnose and fix without the heartache of the fragmented approach of firing and collecting per tool, or re-inventing the wheel for the same gain.
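The conditional-trigger mechanism is worth seeing in miniature. Here is a toy sketch of the idea in Python – my own illustration, not pt-stalk’s actual shell code, which watches a status variable such as Threads_running via options like --variable, --threshold, and --cycles:

```python
def stalk(sample, threshold, cycles, collect, max_samples=1000):
    """Fire a one-off diagnostic collection when a sampled metric stays
    above `threshold` for `cycles` consecutive samples.

    sample: callable returning the current metric value.
    collect: callable invoked once when the condition fires; pt-stalk
             would gather vmstat, iostat, SHOW STATUS output, and so on.
    Returns the triggering value, or None if the condition never fired.
    """
    hits = 0
    for _ in range(max_samples):
        value = sample()
        # Count consecutive over-threshold samples; reset on any dip below.
        hits = hits + 1 if value > threshold else 0
        if hits >= cycles:
            collect()
            return value
    return None
```

Requiring several consecutive samples over the threshold is what keeps a single momentary spike from triggering a full (and expensive) collection run.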
After the break it was onto the Facebook talk about how they back up their vast volumes of data, ‘Backing Up Facebook’. Interestingly, they use a combination of mysqldump and binary backups with a variety of transfer methods to capture their data. One of the steps is writing to HDFS. This, combined with a custom tool for continuously restoring and testing their backups, was delivered with the tag line: “If you haven’t tested your backups, you don’t have them”. Although the talk was a very specific use case for Facebook and their volumes of constantly changing data, there’s also a general lesson to be learned: the importance of taking and testing backups of your data. At Pythian, we are continually reviewing backup methods per client whilst stressing the importance of testing that the backups are indeed consistent should a restore be necessary.
I finished up the day in the expo hall, visiting more of the many stalls to hear about the participants’ projects and new features. Ben and I spent some time with the TokuTek guys learning more about what’s involved in their version 6.0, armed with questions around backups and metric exposure for their storage engine TokuDB. They claim to have made great strides in reducing slave lag by increasing insertion rates. In doing so, slaves are less likely to be subjected to lag because of the limitations that single-threaded slave inserts incur. For more information, head over to the TokuDB information page.