Another early start. But how could I miss Rasmus talking about PHP? The slides are there if you are interested. The history of PHP is fascinating, and it is interesting to see how it evolved. Rasmus also showed us how the MySQL conference site (run by O’Reilly) and the IBM site were vulnerable to cross-site scripting (XSS)! With, of course, tips on how to avoid it, how to write better code, and how to debug performance issues.
Paul Tuckfield is next, with Scaling MySQL at YouTube, and it’s (surprise, surprise) MySQL with replication. And memcache. With Python (okay, that’s a surprise!). I’m not surprised, because scaling with MySQL is much more than just throwing more hardware at the problem, and that’s been drilled in over and over again, with all the sites we have worked on and all the fast-growing sites out there. Some common-sense I/O settings too (something I personally have pushed for in all the environments I am responsible for): cache reads only in the DB, and avoid the filesystem read cache and the RAID controller read cache. On the other hand, make sure you have a battery-backed write cache on the controller, and that it is working. Multiple spindles are a good idea, and RAID 10 is the best for a database. Big stripe/chunk sizes are good, the biggest you can get. I should write a separate blog on db server disk setup, it’s such a huge topic... so much information, so little time!
One more thing Paul mentioned which I found intriguing was speeding up replication by predictive caching on the replica slaves. Parse the binlog outside and ahead of the slave, convert updates into selects, and run the selects on the slave to predictively cache the db blocks that are going to be updated. It sounds a little complicated, and while it had the audience (and me) wow-ing when he first mentioned it, on second thought I’m sure there are other variables at play: the regular reads on the slave will also interact with these predictive reads, and that may lead to poorer read performance on the slave, even if the updates are a little faster. But an intriguing concept nonetheless, and worth a look.
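To make the "convert updates into selects" step a bit more concrete, here is a rough sketch in Python. This is my own illustration, not Paul's actual tool: the function name `to_prefetch_select` is made up, it only handles simple single-table UPDATE and DELETE statements with a WHERE clause, and it assumes you have already pulled the statements out of the binlog as text (that part is not shown).

```python
import re

# Hypothetical helper, for illustration only. Handles simple single-table
# statements of the form: UPDATE <table> SET ... WHERE <condition>
_UPDATE_RE = re.compile(
    r"^\s*UPDATE\s+(?P<table>[`\w.]+)\s+SET\s+.+?\s+WHERE\s+(?P<where>.+?)\s*;?\s*$",
    re.IGNORECASE | re.DOTALL,
)

# ...and: DELETE FROM <table> WHERE <condition>
_DELETE_RE = re.compile(
    r"^\s*DELETE\s+FROM\s+(?P<table>[`\w.]+)\s+WHERE\s+(?P<where>.+?)\s*;?\s*$",
    re.IGNORECASE | re.DOTALL,
)


def to_prefetch_select(statement):
    """Return a SELECT that reads the rows the statement would modify.

    Returns None for anything this sketch does not understand (INSERTs,
    statements without a WHERE clause, multi-table updates, and so on).
    A string literal containing " WHERE " in the SET list would confuse
    the regex; a real tool would need a proper SQL parser.
    """
    for pattern in (_UPDATE_RE, _DELETE_RE):
        match = pattern.match(statement)
        if match:
            return "SELECT * FROM {} WHERE {}".format(
                match.group("table"), match.group("where")
            )
    return None


if __name__ == "__main__":
    # Statements as they might appear after decoding the binlog.
    for stmt in (
        "UPDATE users SET views = views + 1 WHERE id = 42;",
        "DELETE FROM sessions WHERE expires < 1000",
        "INSERT INTO users (id) VALUES (1)",
    ):
        print(to_prefetch_select(stmt))
```

The generated SELECTs would then be run against the slave a little ahead of the replication thread, so the blocks are already cached by the time the real update arrives. How far ahead to run, and how to keep this from competing with the regular reads, is exactly the part I would want to test.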
Zmanda is next with their enterprise backup system based on Amanda. We use Amanda at Pythian, and it’s worked pretty well for us in the 3 years since I implemented it. I have had to recover from tape a few times and it has been quite painless. As any DBA knows, backups are the one thing that can make or break an organization, and the only time you find out how good your backups are (and if you still have your job) is when you need to use the backup. I’m vendor-agnostic as far as backup solutions go, but I do believe in regular testing and validation of backups. There’s no point in having backups that you have never tested. Oh, and RAID is not backup. Another blog in the making!
Sometimes a little downtime is a good thing: I managed to get suspend/hibernate working on my laptop during the keynotes. It just needed a little bit of tweaking. I commented out the hwclock syncs (I know those are problematic on my laptop), and I added the ndiswrapper and nvidia modules to the list of modules to be removed before suspend/hibernate and reloaded after. I’ll investigate which one is the cause of the issue by working backwards, but this works for me now and I’m happy.
There is a closing keynote at 3pm; I will keep you posted as usual.