Riak on FreeBSD ZFS Jails

This is a ramshackle and slipshod introduction to establishing a specific flavor of FreeBSD environment under which to evaluate, test, or play with Riak, the fantastic distributed database by Basho Technologies, Inc.

PLEASE NOTE: This content contains outdated and possibly irrelevant information.

The configuration and processes described in this guide should not be construed as advice for use in production environments. Providing such information is beyond the scope of this guide as well.

Goals For This Guide

Let’s get a clear understanding of what I’m going to be describing to you in this guide out of the way before we roll up our sleeves and commence to hasty typing, bug smashing, and eventual triumphant success with accompanying celebratory beverages.

The point of this guide is to do something closely resembling the following on my modest bare metal quad-core machine with only 16GB RAM and approximately 200GB of SAS 10k rotational storage:

  1. Install FreeBSD (I am using version 8.3 for my example system) on suitable hardware for running a small test Riak cluster.
  2. Configure the system to function as a jail environment host and allocate some of the available storage for a ZFS zpool for jail environments.
  3. Use ezjail with ZFS-specific functionality to create a base jail environment plus a jail flavor on which each Riak node will be based.
  4. Create and minimally configure each Riak node.
  5. Launch nodes, join nodes, and test our cluster in various ways.
  6. Discuss minimal system tuning for best use in certain scenarios.
  7. Discuss tools for operational and performance testing and monitoring of the cluster (e.g., Basho Bench, Riaknostic, Riemann).

The Host Machine

The following notes pertain to work done on the physical server for establishing a jails environment.

Savor The Riak Jail Flavor

We will first build and lightly configure a Riak jail flavor, so that spinning up subsequent Riak nodes will take much less time. The goal here is to build a Riak node, and do initial common software package installation, but without any unique configuration.
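Once a flavor exists, spinning up the node jails from it might look something like the sketch below. It assumes ezjail's `ezjail-admin create -f <flavor> <hostname> <ip>` form; the flavor name `riak` and the 10.0.0.x addresses are made up for illustration and are not taken from my setup. The script only prints the commands so you can review them before running anything.

```shell
#!/bin/sh
# Sketch: print the ezjail-admin commands that would create N Riak jails
# from a single flavor. The flavor name "riak" and the 10.0.0.x addresses
# are assumptions for illustration only.
FLAVOR="riak"
DOMAIN="example.com"
NETWORK="10.0.0"

print_jail_commands() {
    # $1 = number of jails to create
    i=1
    while [ "$i" -le "$1" ]; do
        # Hostname riakN.example.com, address 10.0.0.(10+N)
        echo "ezjail-admin create -f $FLAVOR riak$i.$DOMAIN $NETWORK.$((10 + i))"
        i=$((i + 1))
    done
}

print_jail_commands 5
```

Pipe the output to `sh` on the jail host (as root) only after checking that the names and addresses match your network.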

Riak, Dependencies & Supporting Software

  • Riak 1.1.2
  • Erlang R14B04
  • Curl
  • Git
  • Screen
  • Sudo
  • Vim
  • Bash

After the jail is booted, we can install Riak dependencies and supporting software. As of this writing and with FreeBSD 8.3-RELEASE, the ports tree offers Erlang R14B04, which is compatible with the Riak version 1.1.2 we’ll be using.

Use your preferred options/discretion for particular builds of the software ports listed below:

$ cd /usr/ports/lang/erlang && make install distclean && \
  cd /usr/ports/ftp/curl && make install distclean && \
  cd /usr/ports/devel/git && make install distclean && \
  cd /usr/ports/sysutils/screen && make install distclean && \
  cd /usr/ports/security/sudo && make install distclean && \
  cd /usr/ports/editors/vim-lite && make install distclean

Alternatively, you can install package versions of the above like so:

$ pkg_add -r curl erlang git screen sudo vim-lite

Notes for Building Riak on FreeBSD 8.3-RELEASE

The BSD make(1) dance: FreeBSD's base system make is BSD make, while GNU make is installed separately as gmake. The Riak build expects GNU make when it calls make, which can pose an issue, so we can perform a small, rather inelegant, but temporary workaround:

On the jail host:

$ cd /usr/local/jails/basejail/usr/bin
$ sudo mv make bsd-make

From within the jail being used to build the Riak flavor:

$ sudo ln -sf /usr/local/bin/gmake /usr/local/bin/make

Now we can download the Riak 1.1.2 source and build a release (the rel target) on the initial node:

$ export RURL=downloads.basho.com.s3-website-us-east-1.amazonaws.com
$ curl -O "$RURL"/riak/1.1/1.1.2/riak-1.1.2.tar.gz
$ tar zxf riak-1.1.2.tar.gz
$ cd riak-1.1.2
$ gmake rel

Provided the build was successful, you now have a complete Riak node contained in rel/riak. Check out the official Basho documentation on Basic Cluster Setup to learn about configuring a Riak cluster.

NOTE: Don’t forget to undo the hackity gmake/make stuff so that you’ll be able to build ports, etc. later.
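Undoing the workaround is just the two steps above in reverse. Here is a minimal sketch; the function names are my own invention, and each one takes the directory as an argument so you can try it on a scratch directory first.

```shell
#!/bin/sh
# Sketch: reverse the temporary make/gmake workaround.
# Function names are hypothetical helpers, not part of any tool.

restore_bsd_make() {
    # Host side: move the renamed BSD make back into place, if present.
    dir="$1"
    if [ -e "$dir/bsd-make" ]; then
        mv "$dir/bsd-make" "$dir/make"
    fi
}

remove_gmake_link() {
    # Jail side: remove "make" only when it is a symlink, so a real
    # binary is never deleted by accident.
    dir="$1"
    if [ -L "$dir/make" ]; then
        rm "$dir/make"
    fi
}

# On the jail host (as root or via sudo):
#   restore_bsd_make /usr/local/jails/basejail/usr/bin
# Inside the jail used to build the flavor:
#   remove_gmake_link /usr/local/bin
```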

Wash hands, dry on pants and so forth.

At this point, one can also use git to pull in custom environment settings, dotfiles, etc. to ease operation and administration of the nodes.

System Tuning Notes

Here are some minimal system tuning notes. If you want more detailed information, see Basho’s system tuning documentation.

Open Files Limit

Riak opens many files during operation; ensure that the open files limit is set high enough for it. The recommended minimum value is 4096.
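A quick way to check the limit for the current shell is `ulimit -n`. The sketch below wraps that in a small comparison; the threshold of 4096 is an assumed value here, so use whatever minimum the Riak documentation for your version states.

```shell
#!/bin/sh
# Sketch: warn when the open-files limit is below a chosen minimum.
# MIN_OPEN_FILES is an assumed threshold; adjust it to your needs.
MIN_OPEN_FILES=4096

open_files_ok() {
    # $1 = current limit (as printed by `ulimit -n`), $2 = required minimum
    if [ "$1" = "unlimited" ]; then
        return 0
    fi
    [ "$1" -ge "$2" ]
}

current=$(ulimit -n)
if open_files_ok "$current" "$MIN_OPEN_FILES"; then
    echo "open files limit OK: $current"
else
    echo "open files limit too low: $current (want >= $MIN_OPEN_FILES)"
fi
```

Run it as the user that starts Riak, since limits can differ per user and per login class.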

Mounting With noatime

Make sure that you use the noatime option for mounting any volumes which will store Riak data.
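To confirm that a volume really is mounted that way, you can look for noatime in the output of `mount`. The helper below is a rough sketch that assumes FreeBSD-style `mount` lines of the form `device on /path (type, options)`; the function name, mount point, and dataset name are placeholders.

```shell
#!/bin/sh
# Sketch: report whether a mount point carries the noatime option by
# reading `mount` output on stdin. Assumes FreeBSD-style lines such as:
#   tank/jails on /usr/jails (zfs, local, noatime)
has_noatime() {
    # $1 = mount point; stdin = output of `mount`
    grep -F " on $1 (" | grep -q 'noatime'
}

# Usage on a live system (the mount point is a placeholder):
#   mount | has_noatime /usr/jails && echo "noatime is set"
# On ZFS the corresponding dataset property is atime, for example:
#   zfs set atime=off tank/jails      # dataset name is hypothetical
```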

Clustering Notes

After bootstrapping each node, you’ll need to assemble them into a cluster. In this example, I’ve got five nodes using a resolvable fully qualified domain name scheme like this: riak1.example.com, riak2.example.com, …

A good starting point for understanding how to build a basic cluster is the Basho documentation on Basic Cluster Setup.

The simplest process for joining the nodes to form a cluster is to issue the riak-admin join command from N-1 nodes instructing them to join the first node, like so:

$ ssh riak2.example.com riak-admin join riak@riak1.example.com
Success: sent join request to 'riak@riak1.example.com'
$ ssh riak3.example.com riak-admin join riak@riak1.example.com
Success: sent join request to 'riak@riak1.example.com'
$ ssh riak4.example.com riak-admin join riak@riak1.example.com
Success: sent join request to 'riak@riak1.example.com'
$ ssh riak5.example.com riak-admin join riak@riak1.example.com
Success: sent join request to 'riak@riak1.example.com'

As shown in the above example, you should expect to see a success message for each node’s join attempt. If there are problems, you could see an error message instead.
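With more than a handful of nodes, typing each join by hand gets old. The same requests can be sketched as a loop; this assumes the riakN.example.com naming used above, and the `RUN` variable is my own dry-run convention, not anything Riak provides. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch: send a join request from nodes 2..N to the first node.
# RUN defaults to "echo" (dry run); set RUN to an empty string to
# actually execute the ssh commands.
RUN="${RUN-echo}"
FIRST="riak@riak1.example.com"

join_all() {
    # $1 = total number of nodes in the cluster
    n=2
    while [ "$n" -le "$1" ]; do
        $RUN ssh "riak$n.example.com" riak-admin join "$FIRST" || return 1
        n=$((n + 1))
    done
}

join_all 5
```

Stopping at the first failed join (the `|| return 1`) is a deliberate choice, so that a problem with one node is noticed before the rest are sent.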

riak-admin ringready

After a bit of time and provided you saw success messages for each join request, you can verify that the cluster is ready with a command like this executed from any of the nodes:

$ ssh riak1.example.com riak-admin ringready
Attempting to restart script through sudo -H -u riak
TRUE All nodes agree on the ring ['riak@riak1.example.com',
                                  'riak@riak2.example.com',
                                  'riak@riak3.example.com',
                                  'riak@riak4.example.com',
                                  'riak@riak5.example.com']
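If you would rather not keep re-running the command by hand, a small polling loop does the job. This sketch relies only on what the output above shows, namely that a ready ring produces a line starting with TRUE; the function names, attempt count, and delay are arbitrary choices of mine.

```shell
#!/bin/sh
# Sketch: poll `riak-admin ringready` until it reports TRUE or give up.

ring_is_ready() {
    # stdin = output of `riak-admin ringready`;
    # treated as ready when some line starts with TRUE
    grep -q '^TRUE'
}

wait_for_ring() {
    # $1 = node host, $2 = max attempts, $3 = seconds between attempts
    attempt=1
    while [ "$attempt" -le "$2" ]; do
        if ssh "$1" riak-admin ringready 2>/dev/null | ring_is_ready; then
            return 0
        fi
        sleep "$3"
        attempt=$((attempt + 1))
    done
    return 1
}

# Usage: wait_for_ring riak1.example.com 10 6 && echo "ring is ready"
```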

Some other initial checks of the cluster you can try are riak-admin test and gathering status through riak-admin status and the HTTP interface.

riak-admin test

$ ssh riak1.example.com riak-admin test
Attempting to restart script through sudo -H -u riak
Successfully completed 1 read/write cycle to 'riak@riak1.example.com'

riak-admin status

HTTP stats

Use curl to get stats from the HTTP interface like so:

$ curl loadbalancer.example.com/stats

You can run curl directly against your load balancer IP as in the example above or against something like localhost:8098 when on the node directly, but you should not expose Riak directly to any public interfaces.

Instead, always run Riak behind some kind of load balancer or reverse proxy arrangement.
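The stats endpoint returns one large JSON document, so it helps to pull out just the value you care about. Below is a rough sketch that does this with tr and sed for flat numeric values only; `stat_value` is a hypothetical helper, and `node_gets` is used purely as an example stat name. A proper JSON tool is a better choice for anything beyond a quick look.

```shell
#!/bin/sh
# Sketch: extract one numeric value from the JSON returned by /stats.
# Handles only simple "name":number pairs, nothing nested.
stat_value() {
    # $1 = stat name; stdin = JSON document
    tr ',' '\n' |
        sed -n "s/.*\"$1\"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p" |
        head -n 1
}

# Usage on a node (node_gets is just an example stat name):
#   curl -s localhost:8098/stats | stat_value node_gets
```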

Diagnose, Manage & Monitor

Diagnosing Issues

One great utility for performing initial diagnosis of troubles with your Riak cluster is Riaknostic.

Riaknostic ties into the riak-admin command, providing a diag subcommand that can be run alone for a wealth of diagnostic information, or with options for specific checks.

Managing Your Cluster

Riak is an operations-friendly database, and the included command line tools provide an excellent management interface. You are strongly encouraged to familiarize yourself with the riak and riak-admin commands if you will be interacting with Riak often via the command line.

Additional tools are available for managing Riak through a more user-friendly, contemporary web interface. The most popular of these is Basho’s own Riak Control.

Riak Control

Riak Control ships with Riak and can be enabled quite easily to provide a more visually oriented interface to Riak’s inner workings.

Riak Control provides the following functionality out of the box:

More functionality is in the works. Check out the official documentation for Riak Control to learn more.

Performance Testing

A popular tool for performance testing, bulk loading keys, and other testing odds and ends is Basho Bench.

Basho Bench can performance test various aspects of a system like Riak and then, as an optional final step, generate nice R-based graphs of the results.

References