Introducing Rump: Hot-sync two Redis databases using dumps

We're thrilled to announce our first open source project: Rump!

Rump is a tiny little tool focused on one simple thing: getting live data out of an AWS ElastiCache Redis cluster.

We faced this problem when trying to get our staging Redis containers in sync with our production cluster. At Sticker Mule we heavily use Docker and CoreOS, relying on an ElastiCache cluster for our Redis needs in production.

Lately we wanted to make our staging environment as close as possible to our production environment, and Redis is part of it. Here's the journey that ultimately led to Rump.

Don't block

We had one simple requirement: do not block production while getting data. Redis executes commands on a single thread, so one long-running command stalls every other client.

Surprisingly, we discovered that ElastiCache ships with some commands disabled: basically all the commands you would use to safely transfer data.

BGSAVE

The standard way of manually triggering a backup of a Redis database is to issue a BGSAVE and wait for it to finish in the background, a non-blocking operation. Unfortunately BGSAVE is disabled on ElastiCache, unless you go with AWS's own implementation of the snapshot feature.

SLAVEOF

Setting up slaves is another interesting option Redis offers, and it would have been the perfect choice for us.

The plan was to temporarily set the staging Redis containers as slaves of our production cluster to get live data. Unluckily SLAVEOF is disabled too, so there's no way to add slaves to an ElastiCache instance.

Existing tools

There are many awesome Redis tools around that try to simplify the administration of Redis servers, dumping to JSON, etc.

The problem is that most of the stable, maintained tools use the KEYS command to get the keys and then operate on them. KEYS has O(N) complexity, where N is the number of keys in the database, and it blocks Redis until all keys are returned. Staging containers get created and destroyed frequently and we have a good number of keys, so we don't want to DoS our own server.

It was clear we needed a simple tool to just do the sync. We started playing with SCAN to get the keys, and DUMP/RESTORE to get/set values.

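Each SCAN call is O(1): it returns a group of keys currently present in the DB, plus a cursor to the next group. Walking the whole keyspace is still O(N) overall, but the work is split into many short calls, so the server is never blocked for long. That makes SCAN safe to run on a production server, and it is why its implementation has to be different from KEYS.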
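To make the cursor idea concrete, here is a minimal sketch of such a loop in Python. It is not Rump's actual code: the function name `iter_key_batches` is made up for illustration, and `client` is assumed to be a redis-py style client whose `scan(cursor=..., count=...)` returns a `(next_cursor, keys)` pair.

```python
def iter_key_batches(client, batch_size=100):
    """Yield groups of keys by following the SCAN cursor until it wraps to 0.

    `client` is assumed to expose a redis-py style scan() method.
    `batch_size` is only a hint to the server (the COUNT option).
    """
    cursor = 0
    while True:
        # Each call does a small, bounded amount of work on the server.
        cursor, keys = client.scan(cursor=cursor, count=batch_size)
        if keys:
            yield keys
        # A returned cursor of 0 means the iteration is complete.
        if cursor == 0:
            break
```

Keep in mind that SCAN may return the same key more than once during a full iteration, so whatever consumes these batches should tolerate duplicates.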

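DUMP serializes the value stored at a key into an opaque, Redis-specific format, and RESTORE recreates it from that payload. Together they make the job of reading/writing values independent of the key type.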
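As an illustration only (again, not Rump's own code), copying one group of keys could be sketched like this. The helper name `copy_batch` is hypothetical, and `source` and `destination` are assumed to be redis-py style clients offering `pipeline()`, `dump()` and `restore()`.

```python
def copy_batch(source, destination, keys):
    """Copy `keys` from source to destination with pipelined DUMP/RESTORE.

    Returns the number of keys written. Both clients are assumed to be
    redis-py style objects (hypothetical setup, not Rump's implementation).
    """
    # Queue one DUMP per key and send them in a single round trip.
    read_pipe = source.pipeline(transaction=False)
    for key in keys:
        read_pipe.dump(key)
    payloads = read_pipe.execute()

    write_pipe = destination.pipeline(transaction=False)
    copied = 0
    for key, payload in zip(keys, payloads):
        # DUMP returns nil if the key vanished between SCAN and DUMP.
        if payload is None:
            continue
        # TTL 0 means "no expiry"; replace=True overwrites existing keys.
        write_pipe.restore(key, 0, payload, replace=True)
        copied += 1
    write_pipe.execute()
    return copied
```

Two caveats with this sketch: DUMP payloads are binary, so with redis-py the clients should not be created with `decode_responses=True`, and passing a TTL of 0 means expirations are not carried over. The same code works for strings, lists, hashes, sets and so on, because the payload is never inspected.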

With this in mind, here's what Rump brings to the table:

  • Non-blocking, progressive key reading via SCAN.
  • Type-independent value operations via DUMP/RESTORE.
  • Pipelined SCAN and DUMP/RESTORE operations.
  • Reading from the source server and writing to the destination server are concurrent. Rump doesn't store all keys before writing to the destination server.
  • Single cross-platform binary, no dependencies.
  • Minimal footprint and UNIX philosophy: it does just one thing, with two flags.
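The concurrent read/write design in the list above can be sketched roughly as follows. This is a simplified Python illustration under our own assumptions, not how Rump itself is written: `sync`, `batch_size` and `max_pending` are hypothetical names, and both clients are assumed to follow the redis-py interface (`scan`, `pipeline`, `dump`, `restore`).

```python
import queue
import threading

_DONE = object()  # sentinel telling the writer that reading has finished


def sync(source, destination, batch_size=100, max_pending=10):
    """Copy all keys while reading and writing concurrently.

    A bounded queue sits between the reader thread and the writer, so only
    a few batches are held in memory at any time.
    """
    batches = queue.Queue(maxsize=max_pending)

    def reader():
        try:
            cursor = 0
            while True:
                cursor, keys = source.scan(cursor=cursor, count=batch_size)
                if keys:
                    pipe = source.pipeline(transaction=False)
                    for key in keys:
                        pipe.dump(key)
                    # put() blocks when the writer falls behind.
                    batches.put(list(zip(keys, pipe.execute())))
                if cursor == 0:
                    break
        finally:
            batches.put(_DONE)

    thread = threading.Thread(target=reader, daemon=True)
    thread.start()

    written = 0
    while True:
        batch = batches.get()
        if batch is _DONE:
            break
        pipe = destination.pipeline(transaction=False)
        count = 0
        for key, payload in batch:
            if payload is not None:  # key disappeared before DUMP ran
                pipe.restore(key, 0, payload, replace=True)
                count += 1
        pipe.execute()
        written += count
    thread.join()
    return written
```

The bounded queue is what keeps memory flat: the reader pauses as soon as `max_pending` batches are waiting. Error handling is deliberately minimal here; a reader failure simply ends the run early, which a real tool would want to report.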

We hope the tool will be useful to those experiencing the same troubles we had, and many, many thanks to Redis for supporting such a wide array of commands!

P.S. If this is of interest to you, consider joining our team or sending us a message.
