At PalominoDB, we constantly evaluate new technologies and database options for our clients’ environments. NoSQL databases are especially popular right now, and Riak is an increasingly-recommended option for highly available, fault-tolerant scenarios. Moss Gross attended an introductory workshop, and shares his findings here. For more on Riak, please see the Basho wiki.
What is Riak?
Some of the key features of Riak include:
– license is apache2
– key-value store
– eventually consistent (Given a sufficiently long time, all nodes in the system will agree. Depending on your requirements, this could be a determining factor on whether you should use Riak)
– highly available
– each node has one file
– supports map/reduce for spreading out query processing among multiple processes
Key Concept Definitions
Node: One running instance of the Riak binary, which is an Erlang virtual machine.
Cluster: A group of Riak nodes. A cluster is almost always in a single datacenter.
Riak Object: One fundamental unit of replication. An object includes a key, a value, and may include one or more pieces of metadata.
Value: Since it is a binary blob, it can be any format. The content-type of the value is given in the object’s metadata.
Metadata: Additional headers attached to a Riak Object. Types of metadata include content-type, links, secondary indexes
Vector Clock: An opaque indicator that establishes the temporality of a Riak object. This is used internally by Erlang, and the actual timestamp is not intended to be read by the user from the vector clock.
Bucket: Namespace. This can be thought of as an analog to a table, however Riak namespaces are unlike tables in any way. They indicate a group of Riak objects that share a configuration.
Link: Unidirectional pointer from one Riak object to another. Since an object can have multiple links, bi-directional behavior is possible. These links use the HTTP RFC.
Secondary Index: An index used to tag an object for a fast lookup. This is useful for one-to-many relationships
More Key Concepts
Riak uses consistent hashing, which can be thought of as a ring which maps all the possible Riak objects, the number of which can be up to 2^160.
Partition: Logical Division of the ring. This corresponds to logical storage units on the disk. The number of partitions must be a power of two, and in practice should be generally very high. The default number of partitions is 64, and this is considered a very small number.
To insert data, the basic operation is a PUT request to one of the hosts. The bucket is indicated in the address, and metadata is added via headers to the request. For example: curl -v https://host:8091:/buckets/training/keys/my-first-object -X PUT -H “content-type: text/plain” -d “My first key”
To insert data without a key, use a POST request. To retrieve an object, use GET, and to delete, use DELETE.
Riak doesn’t have any inherent locking. This must be handled in the application layer.
Riak has two configuration files, located in /etc/ by default.
vm.args: identifies the node to itself and other clusters. The name of the node is of the form ‘name@foo’, where ‘name’ is a string and ‘foo’ can be an ip or a hostname, and it must resolve to a machine, using /etc/hosts or DNS, etc.
app.config: identifies the ringstate directory and the addresses and ports that the node listens on.
There are four main logs
console.log: All the INFO, WARN, and ERR messages from the node.
crash.log: crash dumps from the Erlang VM
error.log: just the ERR messages from the node
run_erl.log: logs the start and stop of the master process
Things to check for proper operation:
Locally from the machine the node is on, you can run the command ‘riak-admin status’. This gives one minutes stats for the node by name and cluster status
nodename: ‘xxxx’ (compare to what the rest of the cluster things it should be)
connected_nodes (verify it’s what’s expected)
ring_members (includes OOC members)
ring_ownership (has numbers that should show a general balance in
indexes across all of the nodes, if one is significantly different,
indicates a problem)
Riak is very versatile in that you have many choices as to what to use for your storage backend. Some of the possibilities include memory, Bitcask (a homegrown solution written in Erlang, which keeps just a hashmap of keys in memory), LevelDB (which will allow secondary indexing, uses compaction, and does not require you to store all keys in memory), or a combination. They recommend actually using different storage engines for different buckets if it seems more appropriate.
Creating backups in Riak is relatively straightforward:
1. stop node
2. copy data directory (if not in memory)
3. start node back up
4. let read repair handle the rest
Monitoring and Security
– There is no built-in security in Riak – it’s up to the administrator to add access restrictions via http authentication, firewalls, etc. The commercial product, Riak CS, has ACLs and security built-in however.
– JMX monitoring is built-in and can be enabled in the application configuration – just specify the port.