The internet is full of information and arguments about choosing between SQL and NoSQL, and about the benefits and drawbacks of one key-value store or another. This article is neither a rocksdb guide nor an attempt to talk you into using this particular store and my driver for it. I just want to share some preliminary results of my work on streamlining the NIF development process for Erlang. In this article you’ll see a working rocksdb driver that was developed in a couple of evenings.

In one of my projects I faced the challenge of reliably processing a large number of events. Each event takes from 50 to 350 bytes, and there are more than 80 million events per node daily. Note that this article doesn’t address fault-tolerant message delivery. One more processing constraint is that a set of events must be changed atomically and consistently.
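As a rough sanity check of these numbers, here is a minimal sketch that turns the daily volume into an average event rate and a daily payload size. The module and function names are hypothetical, and the result is only an average: the article gives no information about peak load.

```erlang
-module(event_load).
-export([events_per_second/1, daily_bytes/2]).

%% Average event rate for a given daily volume,
%% rounded to the nearest integer (86400 seconds per day).
events_per_second(EventsPerDay) ->
    round(EventsPerDay / 86400).

%% Daily payload volume in bytes, assuming every event has the same size.
daily_bytes(EventsPerDay, EventSize) ->
    EventsPerDay * EventSize.
```

With 80 million events a day this gives an average of roughly 926 events per second, and between 4 GB (all events at 50 bytes) and 28 GB (all events at 350 bytes) of payload per node daily.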

So, the main requirements for a driver are:

  • Reliability
  • Performance
  • Minimal codebase
  • Safety
  • Functionality:
    • All the main kv-functions
    • Column families
    • Transactions
    • Data compression
    • Support of flexible storage setup

Brief overview of current solutions

  • erocksdb is a solution from the leofs developers. Its obvious benefit is that it has been tested on a real project. However, the codebase is outdated and there are no transactions. This driver is based on rocksdb 4.13.
  • rockse has a number of limitations, for instance a lack of configuration options. Most importantly, all keys and values must be strings. It is included in this overview as an example of a whole range of drivers that implement some functions at the expense of others.
  • erlang-rocksdb is a fully functional project started in 2014. Like erocksdb, it is used in real projects. It has a large C/C++ codebase and a wide variety of functions. Erlang-rocksdb is a sound general-purpose choice for most projects.

After reviewing the existing Erlang drivers for rocksdb, I found that none of them fully met the project requirements. Of course, I could simply have used erlang-rocksdb. However, with a couple of free evenings and a bit of curiosity, I wondered whether it was possible to meet all the current requirements and implement most of the functions in a NIF in a short time. After successfully developing and deploying a Bloom filter in Rust, such curiosity is understandable, isn’t it?

Rocker

Rocker is a NIF for Erlang that uses the Rust binding for rocksdb. Its key features are safety, performance and a minimal codebase. Keys and values are stored as binaries, which imposes no restrictions on the storage format. At this point the project is suitable for use in third-party solutions.

API overview

Open database

You can work with a database in two modes:

  1. Default column family. In this mode all your keys are stored in the same set. Rocksdb lets you fine-tune the storage for your current tasks. Depending on them, the database can be opened in two different ways:
    • using a standard set of options
      {ok, Db} = rocker:open_default(<<"/project/priv/db_default_path">>).
      
      This operation returns a handle for working with the database. While the handle exists, the database is locked against any other attempt to open it. As soon as the handle is released, the database is unlocked automatically.
    • fine-tuning options to your needs
      {ok, Db} = rocker:open(<<"/project/priv/db_path">>, #{
          create_if_missing => true,
          set_max_open_files => 1000,
          set_use_fsync => false,
          ...
          set_disable_auto_compactions => true,
          set_compaction_style => universal
      }).
      
  2. Split into several column families. Keys are saved into so-called column families, and each family can have its own options. Let’s look at an example of opening a database with standard options for all column families:
     {ok, Db} = case rocker:list_cf(BookDbPath) of
         {ok, CfList} ->
             rocker:open_cf_default(BookDbPath, CfList);
         _ ->
             CfList = [],
             rocker:open_default(BookDbPath)
     end.
    

Delete database

To delete a database correctly, call rocker:destroy(Path). The database must not be in use at that moment.

Recover database after a failure

After a system failure the database can be recovered with rocker:repair(Path). The process consists of four steps:

  1. searching for files
  2. recovering tables by replaying the WAL
  3. extracting metadata
  4. writing the descriptor

Column family creation

rocker:create_cf_default(Db, <<"testcf1">>) -> ok.

Column family deletion

rocker:drop_cf(Db, <<"testcf">>) -> ok.

CRUD operations

Data writing by key

rocker:put(Db, <<"key">>, <<"value">>) -> ok.

Data retrieval by key

rocker:get(Db, <<"key">>) -> {ok, <<"value">>} | notfound

Data deletion by key

rocker:delete(Db, <<"key">>) -> ok.

Data writing by key within CF

rocker:put_cf(Db, <<"testcf">>, <<"key">>, <<"value">>) -> ok.

Data retrieval by key within CF

rocker:get_cf(Db, <<"testcf">>, <<"key">>) -> {ok, <<"value">>} | notfound

Data deletion by key within CF

rocker:delete_cf(Db, <<"testcf">>, <<"key">>) -> ok
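Since a lookup returns either {ok, Value} or notfound, calling code usually has to handle both shapes. The following is a minimal sketch of a helper for that. The module kv_result and the function value_or/2 are hypothetical and not part of Rocker; only the two result shapes come from the signatures above.

```erlang
-module(kv_result).
-export([value_or/2]).

%% Unwrap a lookup result of the shape {ok, Value} | notfound.
%% Any other term fails with function_clause, so unexpected
%% results are not silently replaced by the default.
value_or({ok, Value}, _Default) when is_binary(Value) ->
    Value;
value_or(notfound, Default) ->
    Default.
```

A call might look like Value = kv_result:value_or(rocker:get(Db, <<"key">>), <<>>). The same helper would work with the result of rocker:get_cf, because both calls are shown with the same return shapes.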

Iterators

As you know, one of the basic principles of rocksdb is that keys are stored in sorted order. This feature is vital in real tasks and, to use it properly, we need data iterators. Rocksdb offers a few iteration modes. You can find code samples in the tests: rocker_SUITE.erl

  • From the beginning of the table: {'start'}
  • From the end of the table: {'end'}
  • From a certain key forward: {'from', Key, forward}
  • From a certain key in reverse: {'from', Key, reverse}

All of these modes also work when iterating over column families.
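To make the four modes concrete, here is a small pure-Erlang model of the order in which keys would be visited. It does not call Rocker at all: the module iter_model is hypothetical, and it rests on two assumptions. One is that {'end'} walks backward from the last key. The other is that a 'from' iterator starts at the given key, or at the nearest key in the chosen direction if that key is absent.

```erlang
-module(iter_model).
-export([scan/2]).

%% Model of the visiting order for each iterator mode.
%% Keys is a list of binaries; it is sorted here to mimic
%% the ordered storage that the real iterators walk over.
scan(Keys, {'start'}) ->
    lists:sort(Keys);
scan(Keys, {'end'}) ->
    %% Assumption: the 'end' mode iterates backward from the last key.
    lists:reverse(lists:sort(Keys));
scan(Keys, {'from', Key, forward}) ->
    [K || K <- lists:sort(Keys), K >= Key];
scan(Keys, {'from', Key, reverse}) ->
    %% Assumption: starts at Key (or the closest smaller key) and goes backward.
    lists:reverse([K || K <- lists:sort(Keys), K =< Key]).
```

For the keys a, b, c and d, the mode {'from', <<"b">>, forward} visits b, c, d, while {'from', <<"b">>, reverse} visits b, a. Erlang compares binaries byte by byte, which matches a bytewise key order, so the model needs no custom comparator.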

Create iterator

rocker:iterator(Db, {'start'}) -> {ok, Iter}.

Check iterator

rocker:iterator_valid(Iter) -> {ok, true} | {ok, false}.

Create iterator for CF

rocker:iterator_cf(Db, Cf, {'start'}) -> {ok, Iter}.

Create prefix iterator

A prefix iterator requires the prefix length to be set explicitly when the database is opened.

{ok, Db} = rocker:open(Path, #{
    prefix_length => 3
}).

Here is an example of creating an iterator for the prefix 'aaa':

{ok, Iter} = rocker:prefix_iterator(Db, <<"aaa">>).
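The set of keys such an iterator is expected to visit can be modelled in plain Erlang. This is only a sketch of the idea, not Rocker code: the module prefix_model is hypothetical, and it assumes the iterator yields, in sorted order, exactly the keys that begin with the given prefix.

```erlang
-module(prefix_model).
-export([with_prefix/2]).

%% Return the keys that start with Prefix, in sorted order.
with_prefix(Keys, Prefix) ->
    Len = byte_size(Prefix),
    [K || K <- lists:sort(Keys), has_prefix(K, Prefix, Len)].

%% A key shorter than the prefix can never match.
has_prefix(Key, Prefix, Len) when byte_size(Key) >= Len ->
    binary:part(Key, 0, Len) =:= Prefix;
has_prefix(_Key, _Prefix, _Len) ->
    false.
```

For the keys aa, aaa1, aaa2, aab1 and bbb1 and the prefix aaa, the model returns aaa1 and aaa2 only.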

Create prefix iterator for CF

Like the previous prefix iterator, this one requires prefix_length to be set for the column family.

{ok, Iter} = rocker:prefix_iterator_cf(Db, Cf, <<"aaa">>).

Get the next element

rocker:next(Iter) -> {ok, <<"key">>, <<"value">>} | ok

This method returns the next key/value pair, or ok once the iterator is exhausted.
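A typical way to consume such an iterator is a recursive loop that stops at the first ok. Below is a minimal sketch with a hypothetical module iter_drain. The "next" function is passed in as an argument, so the loop can be tried without a database; with Rocker it would presumably be fun rocker:next/1.

```erlang
-module(iter_drain).
-export([collect/2]).

%% Collect every {Key, Value} pair an iterator yields, in iteration order.
%% Next is a fun that returns {ok, Key, Value} or ok when finished.
collect(Next, Iter) ->
    collect(Next, Iter, []).

collect(Next, Iter, Acc) ->
    case Next(Iter) of
        {ok, Key, Value} ->
            collect(Next, Iter, [{Key, Value} | Acc]);
        ok ->
            lists:reverse(Acc)
    end.
```

A call might look like Pairs = iter_drain:collect(fun rocker:next/1, Iter). Note that this keeps all pairs in memory, so for large tables it is safer to process each pair inside the loop instead of accumulating them.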

Transactions

It is often necessary to write changes to a set of keys simultaneously. Rocker lets you combine CRUD operations, both on the default key set and on column families, in a single transaction. The following example illustrates working with transactions:

{ok, 6} = rocker:tx(Db, [
   {put, <<"k1">>, <<"v1">>},
   {put, <<"k2">>, <<"v2">>},
   {delete, <<"k0">>, <<"v0">>},
   {put_cf, Cf, <<"k1">>, <<"v1">>},
   {put_cf, Cf, <<"k2">>, <<"v2">>},
   {delete_cf, Cf, <<"k0">>, <<"v0">>}
]).
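In practice the operation list is often built from data rather than written by hand. Here is a minimal sketch of such a builder. The module tx_ops and its functions are hypothetical; only the {put, Key, Value} and {put_cf, Cf, Key, Value} tuple shapes are taken from the example above.

```erlang
-module(tx_ops).
-export([puts/1, puts_cf/2]).

%% Turn [{Key, Value}] into put operations for the default key set.
puts(Pairs) ->
    [{put, K, V} || {K, V} <- Pairs].

%% Turn [{Key, Value}] into put operations for one column family.
puts_cf(Cf, Pairs) ->
    [{put_cf, Cf, K, V} || {K, V} <- Pairs].
```

The resulting lists can be concatenated and passed to rocker:tx, for example {ok, _} = rocker:tx(Db, tx_ops:puts(Pairs) ++ tx_ops:puts_cf(Cf, Pairs)). The count is deliberately not matched here, because the article shows it only for its own six-operation example.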

Performance

The test suite includes a performance test. It shows about 30k read RPS and 200k write RPS on my machine. In real conditions we can expect about 15–20k read RPS and 120k write RPS, with an average of about 1 kB of data per key and more than 1 billion keys in total.
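To get a feel for what these request rates mean in terms of data volume, here is a minimal sketch that converts a request rate and value size into payload bandwidth. The module name is hypothetical, and it assumes 1 kB means 1024 bytes and counts values only, ignoring keys and storage overhead.

```erlang
-module(throughput).
-export([mib_per_second/2]).

%% Payload bandwidth in MiB/s for a request rate and a value size in bytes.
mib_per_second(RequestsPerSecond, ValueBytes) ->
    RequestsPerSecond * ValueBytes / (1024 * 1024).
```

Under these assumptions 120k writes per second correspond to about 117 MiB/s of values, and 15–20k reads per second to roughly 15–20 MiB/s.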

Conclusion

Developing Rocker and deploying it in our project reduced the system’s response time and restart time and improved reliability. All of these benefits came at minimal development and deployment cost. Personally, I’m convinced that Rust is a perfect fit for any Erlang project that needs streamlining. While 95% of your code can be implemented quickly and efficiently in Erlang, the remaining 5% that holds you back can be rewritten or fine-tuned in Rust. What’s more, this will not reduce the overall reliability of the system in the least.