At Palomino, we’ve been hard at work building the Palomino Cluster Tool. Its goal is to let you build realistically-sized[1] and functionally-configured[2] distributed databases in a matter of hours, instead of the days or weeks it takes at present. Today…
If you regularly manage enormous data sets, you’ve probably heard about Tumblr’s exciting new toolkit called Jetpants, which automates common processes for MySQL data management, most notably in the area of rebalancing shards for more efficient scaling. In evaluating and…
In the course of a large-cluster database administrator’s job, there are dozens of times a week when it can be useful to visualise some data. You’re constantly working with machines that have hundreds of databases, directories, files, log files with…
Recently I was tasked with setting up an HBase cluster to test against Amazon’s DynamoDB offering. The client had found in testing that it worked well for up to 10k updates/sec, but was concerned about the cost. I set up a…
What is Ansible? Ansible is a configuration management and deployment system, like Puppet, Capistrano, Fabric, and Chef. Its aim is to be radically simple and let you use your existing scripts to help with cluster configuration and software deployment whenever possible….
Optimizing your queries: There are two general methods for creating a query plan for a query. A rule-based optimizer goes through your query, sees if you’re doing X or Y, and then optimizes for method A or B accordingly. Rule-based…