[erlang-questions] How would you implement a blob store

Scott Lystig Fritchie fritchie@REDACTED
Wed Jun 18 06:11:41 CEST 2014


Hi, Joe.  You've probably moved on to 7 different projects since you
started this thread, but perhaps the Erlang User Conference slowed you
down?  ^_^

Joe Armstrong <erlang@REDACTED> wrote:

ja>   - The values are variable size binaries (max 56 KB)
ja>   - The keys are SHA1 hashes of the values
ja>   - I want to store max 1M blobs
ja>   - Efficiency is not a concern (...)

ja> The *simplest* way I can think of is to use the file store [...]

Your proposed addition of the additional subdirectory layers is
overkill, when considering both your "simplest" desire (or perhaps just
"simple") and also today's file systems: most are going to do just fine
with 1 million files in a single subdirectory.
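
A minimal sketch of that flat layout, assuming the file name is the
lowercase hex form of the SHA1 hash.  The module name blob_path and the
functions key/1 and path/2 are made up for illustration; they are not
from Joe's post.

```erlang
-module(blob_path).
-export([key/1, path/2]).

%% Hex-encode the SHA1 hash of a blob; the result is used as the file name.
%% (Lowercase hex is an assumption; any stable encoding would do.)
key(Blob) when is_binary(Blob) ->
    Digest = crypto:hash(sha, Blob),
    lists:flatten([io_lib:format("~2.16.0b", [B]) || <<B>> <= Digest]).

%% Every blob lives directly in Dir, with no intermediate subdirectories.
path(Dir, Blob) ->
    filename:join(Dir, key(Blob)).
```

With this, blob_path:path("/var/blobs", Bin) names one file per blob,
and all of them sit side by side in /var/blobs (a hypothetical
directory).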

If you're worried about data integrity on disk at all times, then you'll
have to do something like this untested code:

    ok = file:write_file(Path ++ ".tmp", FileContents),
    ok = file:rename(Path ++ ".tmp", Path)

... and then when reading the file's contents, compare the hash to the
filename (which you'd said *is* the hash of the file's contents).
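
Put together, the write-then-rename step and the check on read might
look roughly like the sketch below.  It is equally untested in
production.  The module name blob_store, the functions store/2 and
fetch/2, and the {error, corrupt} return value are assumptions made for
illustration.  Dir is assumed to be a string, and keys are assumed to
be lowercase hex strings.

```erlang
-module(blob_store).
-export([store/2, fetch/2]).

%% Write the blob to a temporary file, then rename it into place, so a
%% reader never sees a partially written file under the final name.
%% Note that file:write_file/2 does not force the data to disk.
store(Dir, Blob) when is_binary(Blob) ->
    Key = hex(crypto:hash(sha, Blob)),
    Path = filename:join(Dir, Key),
    Tmp = Path ++ ".tmp",
    ok = file:write_file(Tmp, Blob),
    ok = file:rename(Tmp, Path),
    {ok, Key}.

%% Read the blob back and compare its hash to the file name (the key).
fetch(Dir, Key) ->
    case file:read_file(filename:join(Dir, Key)) of
        {ok, Blob} ->
            case hex(crypto:hash(sha, Blob)) of
                Key -> {ok, Blob};
                _Other -> {error, corrupt}
            end;
        {error, _Reason} = Error ->
            Error
    end.

%% Lowercase hex encoding of a binary.
hex(Bin) ->
    lists:flatten([io_lib:format("~2.16.0b", [B]) || <<B>> <= Bin]).
```

A caller would do {ok, Key} = blob_store:store(Dir, Bin) and later
{ok, Bin} = blob_store:fetch(Dir, Key); a file whose contents no longer
match its name comes back as {error, corrupt}.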

Bitcask would do just fine, though for such a small amount of data, its
two advantages (fast lookups from having all keys in Bitcask's keydir,
and fast writes from merely appending your <56KByte objects to a big
file) probably won't gain you much.

-Scott


