[erlang-questions] How would you implement a blob store

Jay Nelson jay@REDACTED
Fri Jun 13 21:03:57 CEST 2014


Joe wrote:

> I want some opinions on how to implement a blob store.
> I want a simple key-value store.

> The *simplest* way I can think of is to use the file store
> a blob with (hex) hash "a2e34a32..." gets stored in 2-level directory
> structure in a file called a2/e3/a2e34a32

> Even this might have problems - for example is file:write_file/2 atomic?

This is typically done by writing to a scratch area with a unique
id, then moving the file on top of the existing or non-existing
file with the destination key path. Move is typically atomic.

>
> The next simplest way I can think of is to make a single huge blob store
> file (max 56GB) and use an ets table to map hashes to addresses in the file -
> if this is a good idea or not would depend upon how well the host OS
> handles sparse files and so on.

The old-fashioned fixed-record length approach to data base implementation.

——————

You might want to look at https://github.com/krestenkrab/vbisect which
is an implementation of a binary dictionary with variable-length binary
keys and variable-length binary values. I am working on making the
dictionary easily interchangeable with other erlang dictionaries and adding
common_test + PropEr testing to validate the implementation in
the ‘cleanup’ branch. I have future plans for principled blob/object
mapping to enable pluggable storage in https://github.com/jaynel/vbisect.
It is easily compatible with both rebar and erlang.mk
(especially since it is just a single .erl file).

The nice thing about a vbisect dictionary is that it is a single erlang
binary and uses binary pattern-matching internally to find values. It
contains a sorted tree of the keys and can find keys at the rate of
> 25K/sec on my Early 2011 MacBook Pro quad core i7. Because
the dictionaries are single binaries they can be shared by multicore,
messaged easily to other processes on the same node, are
true persistent data structures, and are easily read and written from
files (even from within a larger file containing a collection of them).
They do have the drawback of expensive modification, which could
become large with large dictionaries. We have not tried to optimize
modification yet, but I’m sure there are some ways to make it efficient
using sub-binaries and pattern-matching carefully as it currently does.

Kresten designed vbisect as an internal data structure to use inside
Hanoi DB which is his erlang-only eleveldb replacement
(https://krestenkrab/hanoidb) which might work as a solution as well.

Leveraging Dmitry’s suggestion of partitioning the keyspace and
using a separate vbisect dictionary for each allows for better distribution,
concurrency and faster lookup, as well as smaller dictionaries to avoid
the modification cost.

Fiddling with a vbisect is probably the simplest solution since it looks
like a binary blob and all the source is just a few hundred lines in a single file.

jay




More information about the erlang-questions mailing list