[erlang-questions] [ANN] LETS - LevelDB-based Erlang Term Storage v0.5.3

Wed Nov 23 00:30:50 CET 2011

On Mon, Nov 21, 2011 at 13:59, Joseph Norton <norton@REDACTED> wrote:
>
> Ciprian -
>
> For encoding and then returning keys and values to the Erlang virtual machine, leveldb::Slice() is sufficient.  I didn't run any comparison tests, but my assumption is that using std::string() requires (should require?) an extra memory allocation, copy, and deallocation for the std::string() object itself.
>
> Joe N.


    Hi again!

    For my Go bindings I ran a small benchmark, implementing `get`
either in terms of `Get(Slice&,std::string*)` or in terms of
`NewIterator() / Seek(Slice&) / Compare(Slice&)`, and I got some
quite interesting results:
    * on small sets (100k) there seems to be no noticeable performance
difference if the key exists;
    * but on large sets (1m) the iterator-based variant is about 2x
slower;
    * and it gets worse when the key does not exist; (the performance
drops to about a couple of ops per second...)
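
    To make the second variant concrete, here is a minimal sketch of a
`get` built from "seek to the first key not less than the target, then
compare". It assumes a hypothetical in-memory `sortedStore` standing in
for the database (the names `sortedStore`, `seek` and `getViaSeek` are
mine, not part of LevelDB or of the bindings), so it only illustrates
the lookup logic and says nothing about the timings:

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// sortedStore is a hypothetical in-memory stand-in for an ordered
// key/value store. It is NOT LevelDB; it only mimics the ordering.
type sortedStore struct {
	keys   [][]byte // sorted in bytewise order
	values [][]byte // values[i] belongs to keys[i]
}

// newSortedStore builds a store from a map, sorting the keys bytewise.
func newSortedStore(pairs map[string]string) *sortedStore {
	names := make([]string, 0, len(pairs))
	for k := range pairs {
		names = append(names, k)
	}
	sort.Strings(names) // Go compares strings byte by byte

	s := &sortedStore{}
	for _, k := range names {
		s.keys = append(s.keys, []byte(k))
		s.values = append(s.values, []byte(pairs[k]))
	}
	return s
}

// seek returns the index of the first key >= target, or len(keys) if
// there is none. This mirrors what an iterator Seek positions on.
func (s *sortedStore) seek(target []byte) int {
	return sort.Search(len(s.keys), func(i int) bool {
		return bytes.Compare(s.keys[i], target) >= 0
	})
}

// getViaSeek implements a point lookup as seek + compare: a seek alone
// is not enough, because for a missing key it lands on the next key.
func (s *sortedStore) getViaSeek(key []byte) ([]byte, bool) {
	i := s.seek(key)
	if i < len(s.keys) && bytes.Equal(s.keys[i], key) {
		return s.values[i], true
	}
	return nil, false
}

func main() {
	s := newSortedStore(map[string]string{"a": "1", "c": "3"})
	for _, k := range []string{"a", "b"} {
		v, ok := s.getViaSeek([]byte(k))
		fmt.Printf("get(%q) = %q, found = %v\n", k, v, ok)
	}
}
```

    Note that for a missing key the seek still has to position itself
on the next existing key before the comparison can reject it.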

    My experiment was as follows:
    * step1) put 1m pairs, each composed of a little-endian unsigned
64-bit key and value (the key goes from 0 to 1m, and the value is the
key squared);
    * step2) get 1m pairs;
    * step3) delete those which `key & pattern == 0`;
    * step4) re-get the 1m keys and verify whether each should exist
and what value it holds;
    * I use little endian to mix the keys a little bit;
    * I do not reopen the database between the four steps; I do reopen
the database for each experiment;
    * all experiments are done over tmpfs (without swapping) and each
experiment starts with a fresh database;
    * the rates are computed by dividing the total number of
operations by the overall time; (the actual speed varies over time
as a result of the workload pattern...)
    * in the case of the re/get experiment I don't let it run more
than 20 seconds;
    * (take into account that the benchmark is "driven" by Go and it
has some overhead, but the Go call path is identical in both setups,
thus it doesn't influence the trend;)
    * the delete speed varies because I only count a delete when I
actually perform one, but I still have to go through the entire key
range;

    Results:
~~~~
# get as `NewIterator()/Seek()/Compare()`
del-pattern | put/s | get/s | del/s | reget/s
0x00 -- all |   47k |   17k |   34k |       4
0x10 -- 50% |   50k |   17k |   36k |     200
0x70 -- 14% |   48k |   17k |   38k |     762
0xf0 --  6% |   50k |   17k |   46k |    1928
0xf... none |   50k |   17k |   --  |     17k
~~~~
# get as `Get(Slice&,std::string*)`
del-pattern | put/s | get/s | del/s | reget/s
0x00 -- all |   48k |   44k |   42k |   47k
0x10 -- 50% |   49k |   43k |   64k |   48k
0x70 -- 14% |   38k |   43k |   60k |   42k
0xf0 --  6% |   49k |   42k |   31k |   45k
0xf... none |   37k |   43k |   --  |   43k
~~~~

    Hope you find it useful,
    Ciprian.