Riak on FreeBSD Jails

PLEASE NOTE: This content contains outdated and possibly irrelevant information.


This is an introduction to building a specific flavor of FreeBSD environment under which to test or play with Riak, the distributed database by Basho Technologies, Inc.

You should not take the configuration and processes I describe in this guide as advice for use in production environments. I wrote this guide to provide tips to those who want to set up a similar development or testing environment.

Goals for this guide

Let’s get a clear understanding of what you can learn in this guide before you roll up your sleeves, and start with the hasty typing, bug smashing, and eventual triumphant success with accompanying celebratory beverages.

This guide can help you build something closely resembling the following setup:

  1. Install FreeBSD (I used version 8.3 for my example system) on suitable hardware for running a small test Riak cluster.
  2. Configure the system to function as a jail environment host and provide some of the available storage for a ZFS zpool for jail environments.
  3. Use ezjail with ZFS specific functionality to create a base jail environment plus a flavor of jail for each Riak node.
  4. Create a minimal configuration for each Riak node.
  5. Launch nodes, join nodes, and test your cluster.
  6. Discuss minimal system tuning for certain use cases.
  7. Discuss tools for operational and performance testing and monitoring of the cluster (for example: Basho Bench, Riaknostic, Riemann).

The host machine

The following notes pertain to work done on the physical server for establishing a jails environment.

Savor The Riak jail flavor

First, build and lightly configure a Riak jail flavor, so that spinning up many Riak nodes will require less time. The goal here is to build a Riak node, and do initial common software package installation, but without any unique configuration.
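As a rough sketch of where the flavor pays off, the loop below prints the ezjail-admin commands that would create one ZFS-backed jail per node from a flavor. The flavor name riak, the 10.0.0.x addresses, and the node count are assumptions for illustration only. The script just echoes the commands so you can review them before running anything on the host.

```shell
#!/bin/sh
# Print (do not run) one ezjail-admin create command per Riak node.
# Assumptions: a flavor named "riak" exists, and the jails use
# hypothetical addresses 10.0.0.11, 10.0.0.12, ...
jail_create_commands() {
    count=$1
    i=1
    while [ "$i" -le "$count" ]; do
        echo "ezjail-admin create -f riak -c zfs riak${i}.example.com 10.0.0.$((10 + i))"
        i=$((i + 1))
    done
}

jail_create_commands 5
```

Once the printed commands look right for your network, you could pipe the output to sh on the jail host.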

Riak and dependencies

You need the following software to run a Riak node:

  • Riak 1.1.2
  • Erlang R14B04
  • Curl
  • Git
  • Screen
  • sudo
  • vim
  • Bash

After you boot the jail, you can install Riak dependencies and supporting software. As of this writing and with FreeBSD 8.3-RELEASE, the ports tree offers Erlang R14B04, which is compatible with the Riak version 1.1.2 you will use.

Use your preferred options/discretion for particular builds of the software ports listed below:

cd /usr/ports/lang/erlang && make install distclean && \
cd /usr/ports/ftp/curl && make install distclean && \
cd /usr/ports/devel/git && make install distclean && \
cd /usr/ports/sysutils/screen && make install distclean && \
cd /usr/ports/security/sudo && make install distclean && \
cd /usr/ports/editors/vim-lite && make install distclean

If you prefer, you can install package versions of the above like so:

$ pkg_add -r curl erlang git screen sudo vim-lite

Notes for building Riak on FreeBSD 8.3-RELEASE

The BSD make(1) dance: FreeBSD ships its own versions of some GNU build tools, and its make(1) is not compatible with GNU make (gmake). This can pose an issue when building Riak, so you can perform a small, rather inelegant, but temporary workaround:

From within the jail host:

cd /usr/local/jails/basejail/usr/bin && \
sudo mv make bsd-make

From within the jail to build the Riak flavor, link gmake to make:

$ sudo ln -sf /usr/local/bin/gmake /usr/local/bin/make

Now you can download the Riak 1.1.2 source, and compile a rel release on the initial node:

$ export RURL=downloads.basho.com.s3-website-us-east-1.amazonaws.com
$ curl -O "$RURL"/riak/1.1/1.1.2/riak-1.1.2.tar.gz
$ tar zxf riak-1.1.2.tar.gz
$ cd riak-1.1.2
$ gmake rel

Provided your build succeeded, you now have a complete Riak node contained in rel/riak. Check out the official Basho documentation on Basic Cluster Setup to learn about configuring a Riak cluster.

NOTE: Don’t forget to undo the gmake/make link, and to rename bsd-make back to make in the basejail, so that you can still build ports, etc. later.

$ rm -f /usr/local/bin/make

Wash hands, dry on pants and so forth.

You can use git to pull in custom environment settings, dot files, etc. to ease operation and administration of the nodes.

System tuning notes

These minimal system tuning notes should help you get started. If you want more detailed information, review the system tuning documentation.

Open files limit

Riak opens many files during operation, so make sure that the open files limit is set high enough. The recommended minimum value is 4096.
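A minimal sketch of checking the limit from the shell that will start Riak. The function name and the MIN_FILES default of 4096 are assumptions for this example, so adjust the threshold to suit your workload.

```shell
#!/bin/sh
# Compare an open files limit against a required minimum.
# MIN_FILES defaults to 4096 here, which is an assumed threshold.
MIN_FILES=${MIN_FILES:-4096}

limit_is_adequate() {
    limit=$1
    # "unlimited" always passes; otherwise compare numerically.
    if [ "$limit" = "unlimited" ]; then
        return 0
    fi
    [ "$limit" -ge "$MIN_FILES" ]
}

current=$(ulimit -n)
if limit_is_adequate "$current"; then
    echo "open files limit OK: $current"
else
    echo "open files limit too low: $current (want >= $MIN_FILES)"
fi
```

Run it as the same user that runs Riak, because the limit is per process and inherited from the shell.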

Mounting with noatime

Make sure that you use the noatime option for mounting any volumes which will store Riak data.
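One way to confirm the option is in effect is to look for noatime in the mount(8) output for the data volume. The helper names below are hypothetical, and the data path in the usage note is only a placeholder.

```shell
#!/bin/sh
# Succeed when a line of mount(8) output lists the noatime option.
has_noatime() {
    case "$1" in
        *noatime*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report the noatime state of the filesystem mounted at the given path.
check_mount() {
    line=$(mount | grep " on $1 " || true)
    if has_noatime "$line"; then
        echo "$1: noatime set"
    else
        echo "$1: noatime NOT set"
    fi
}
```

For example, check_mount /var/db/riak would report on a volume mounted at that (hypothetical) path. On ZFS datasets, access time updates are controlled by the atime property rather than by a mount option.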

Clustering notes

After bootstrapping each node, you’ll need to assemble them into a cluster. In this example, I’ve got five nodes using a resolvable fully qualified domain name scheme like this: riak1.example.com, riak2.example.com, …
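Each node needs a name that matches this scheme before it can join the others. The sketch below rewrites the -name line of a vm.args file read on standard input. It assumes the node name is set by a line of the form "-name riak@127.0.0.1" in etc/vm.args under rel/riak, and the function name is made up for this example.

```shell
#!/bin/sh
# Rewrite the -name line of a vm.args file read on stdin.
# Assumes the file contains a line such as "-name riak@127.0.0.1".
set_node_name() {
    sed "s/^-name .*/-name riak@$1/"
}

# Example on a one-line stand-in for vm.args:
echo "-name riak@127.0.0.1" | set_node_name riak1.example.com
```

In practice you would feed it the real file, write the result to a temporary file, review it, and then move it into place on each node.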

A good starting point for understanding how to build a basic cluster is the Basho documentation on Basic Cluster Setup.

The simplest process for joining the nodes to form a cluster is to issue the riak-admin join command from N-1 nodes instructing them to join the first node, like so:

$ ssh riak2.example.com riak-admin join riak@riak1.example.com
Success: sent join request to 'riak@riak1.example.com'
$ ssh riak3.example.com riak-admin join riak@riak1.example.com
Success: sent join request to 'riak@riak1.example.com'
$ ssh riak4.example.com riak-admin join riak@riak1.example.com
Success: sent join request to 'riak@riak1.example.com'
$ ssh riak5.example.com riak-admin join riak@riak1.example.com
Success: sent join request to 'riak@riak1.example.com'

As shown in the above example, you should expect to see a success message for each node’s join attempt. If there are problems, you could see an error message instead.
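With more than a handful of nodes, typing each join by hand gets tedious. The sketch below generates the same commands for nodes 2 through N, assuming the riakN.example.com naming scheme used in this guide. It only prints the commands, and the function name is hypothetical.

```shell
#!/bin/sh
# Print the join command for nodes 2..N, all targeting the first node.
join_commands() {
    total=$1
    n=2
    while [ "$n" -le "$total" ]; do
        echo "ssh riak${n}.example.com riak-admin join riak@riak1.example.com"
        n=$((n + 1))
    done
}

join_commands 5
```

Pipe the output to sh when you are ready to issue the joins for real, and watch for the success message from each node.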

riak-admin ringready

After a bit of time and provided you saw success messages for each join request, you can verify that the cluster is ready with a command like this executed from any of the nodes:

$ ssh riak1.example.com riak-admin ringready
Attempting to restart script through sudo -H -u riak
TRUE All nodes agree on the ring ['riak@riak1.example.com',
                                  'riak@riak2.example.com',
                                  'riak@riak3.example.com',
                                  'riak@riak4.example.com',
                                  'riak@riak5.example.com']
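If you want to script this check, one simple approach is to look for a line beginning with TRUE, as in the output above. This is only a sketch: the function name is made up, and it treats any output without such a line as "not ready".

```shell
#!/bin/sh
# Succeed when the ringready output on stdin has a line starting with TRUE.
ring_is_ready() {
    grep -q '^TRUE'
}

# Example using a shortened copy of the output shown in this guide:
printf '%s\n' \
    "Attempting to restart script through sudo -H -u riak" \
    "TRUE All nodes agree on the ring ['riak@riak1.example.com']" \
    | ring_is_ready && echo "ring ready"
```

Against a live cluster you would pipe the real command into it, for example ssh riak1.example.com riak-admin ringready | ring_is_ready.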

Some other initial checks of the cluster you can try are riak-admin test and gathering status through riak-admin status and the HTTP interface.

riak-admin test

$ ssh riak1.example.com riak-admin test
Attempting to restart script through sudo -H -u riak
Successfully completed 1 read/write cycle to 'riak@riak1.example.com'

riak-admin status

HTTP stats

Use curl to get stats from the HTTP interface like so:

$ curl loadbalancer.example.com/stats

You can run curl directly against your load balancer IP as in the example above or against something like localhost:8098 when on the node directly, but you should not expose Riak directly to any public interfaces.

Instead, always run Riak behind some kind of load balancer or reverse proxy arrangement.
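The stats endpoint returns a JSON document, so a small filter helps when you only care about one counter. The sketch below pulls a single numeric value out of JSON read on standard input. The function name is hypothetical, the sample JSON is illustrative rather than real cluster output, and the parsing is deliberately naive (flat keys with integer values only).

```shell
#!/bin/sh
# Extract one integer value, by key, from a flat JSON document on stdin.
stat_value() {
    tr ',' '\n' | sed -n "s/.*\"$1\": *\([0-9][0-9]*\).*/\1/p"
}

# Illustrative sample, not real cluster output:
echo '{"vnode_gets":12,"node_gets":4,"node_puts":7}' | stat_value node_gets
```

When logged in to a node, you could use it as curl -s localhost:8098/stats | stat_value node_gets.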

Diagnose, manage & monitor

Diagnosing issues

One great utility for performing initial diagnosis of troubles with your Riak cluster is Riaknostic.

Riaknostic ties into the riak-admin command, providing a diag sub-command that you can run alone for a wealth of diagnostic information, or with options for specific checks.

Manage your cluster

Riak is an operations-friendly database, and the included command line tools provide a great management interface. I strongly encourage you to familiarize yourself with the riak and riak-admin commands if you will be interacting with Riak often via the command line.

You can also find tools for managing Riak through a more user-friendly, contemporary web user interface. The most popular of those tools is Basho’s own Riak Control.

Riak Control

Riak Control ships with Riak, and you can enable it to provide a more visually oriented interface to Riak’s inner workings.

Riak Control provides the following functionality out of the box:

More functionality is in the works. Check out the official documentation for Riak Control to learn more.

Performance testing

A popular tool for performance testing, bulk loading keys, and other testing odds and ends is Basho Bench.

Basho Bench can performance test different aspects of Riak, and then generate nice R-based graphs as a final output option.
