[erlang-questions] [ANN] LETS - LevelDB-based Erlang Term Storage v0.5.3
Ciprian Dorin Craciun
ciprian.craciun@REDACTED
Wed Nov 23 03:32:33 CET 2011
On Wed, Nov 23, 2011 at 01:30, Ciprian Dorin Craciun
<ciprian.craciun@REDACTED> wrote:
> On Mon, Nov 21, 2011 at 13:59, Joseph Norton <norton@REDACTED> wrote:
>>
>> Ciprian -
>>
>> For encoding and then returning keys and values to the Erlang virtual machine, leveldb::Slice() is sufficient. I didn't run any comparison tests but my assumption is that using std::string() requires (should require?) an extra memory allocation, copy, and deallocation for the std::string() object itself.
>>
>> Joe N.
Small observation: I've quickly hacked the LevelDB implementation
so that it takes a `Slice` as the value (all the way down to the core,
where it is actually copied), thus saving the `std::string`
allocation.
But in my preliminary benchmarks, I see no difference in
performance. (Maybe I'm throttled by the test harness...)
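(For reference, here is the copy Joe refers to, sketched against the
plain LevelDB C++ API; this is illustrative code, not the LETS
sources, and the helper name is made up:)
~~~~
#include <string>
#include "leveldb/db.h"

// DB::Get materialises the value into a caller-owned std::string
// (allocation + copy + later deallocation), whereas leveldb::Slice is
// only a pointer/length view over bytes that already live elsewhere.
void get_copy_example(leveldb::DB* db, const leveldb::Slice& key) {
    std::string copied;                           // extra allocation
    leveldb::Status s = db->Get(leveldb::ReadOptions(), key, &copied);
    if (s.ok()) {
        leveldb::Slice view(copied);              // view, no copy
        (void) view;  // e.g. these bytes could be handed to the VM
    }
}
~~~~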
Ciprian.
> Hi again!
>
> So with my Go bindings I've done a small benchmark, implementing
> `get` either in terms of `Get(Slice&,std::string*)` or in terms of
> `NewIterator() / Seek(Slice&) / Compare(Slice&)` (both variants are
> sketched right after this list), and I've got some quite interesting
> results:
> * on small sets (100k) it seems that if the key exists there is no
> noticeable performance difference;
> * but on large sets (1m) the impact is about 2x;
> * and it gets worse when the key does not exist; (the performance
> drops to about a couple of ops per second...)
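> (A rough sketch of the two variants, written against the plain
> LevelDB C++ API rather than the actual binding code; the helper
> names are made up. Note that `Seek()` positions the iterator at the
> first key >= the target, so an explicit compare is needed to detect
> an exact match:)
>
> ~~~~
> #include <string>
> #include "leveldb/db.h"
>
> // Variant 1: DB::Get fills a caller-supplied std::string.
> bool get_via_get(leveldb::DB* db, const leveldb::Slice& key,
>                  std::string* value) {
>     leveldb::Status s = db->Get(leveldb::ReadOptions(), key, value);
>     return s.ok();  // s.IsNotFound() when the key is absent
> }
>
> // Variant 2: NewIterator() / Seek() / compare().
> bool get_via_iterator(leveldb::DB* db, const leveldb::Slice& key,
>                       std::string* value) {
>     leveldb::Iterator* it = db->NewIterator(leveldb::ReadOptions());
>     it->Seek(key);
>     bool found = it->Valid() && it->key().compare(key) == 0;
>     if (found)
>         value->assign(it->value().data(), it->value().size());
>     delete it;
>     return found;
> }
> ~~~~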
>
> My experiment was as follows (the key encoding and the delete rule
> are sketched after this list):
> * step 1) put 1m pairs composed of little-endian unsigned 64-bit
> key / value (the key runs from 0 to 1m, and the value is the key
> squared);
> * step 2) get the 1m pairs back;
> * step 3) delete those for which `key & pattern == 0`;
> * step 4) re-get all 1m keys and verify whether they should still
> exist and what they hold;
> * I use little endian to mix the keys a little bit;
> * I do not reopen the database between the four steps; I do reopen
> the database for each experiment;
> * all experiments are done over tmpfs (without swapping) and each
> experiment starts with a fresh database;
> * the figures are computed by dividing the total number of
> operations by the overall time; (the actual speed varies over time
> as a result of the workload pattern...)
> * in the case of the re-get step I don't let it run for more
> than 20 seconds;
> * (take into account that the benchmark is "driven" by Go and it
> has some overhead, but the Go call path is identical in both setups,
> thus it doesn't influence the trend;)
> * the delete speed varies because I only count a delete when I
> actually execute one, although I still go through the entire key
> range;
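>
> (A rough sketch of the encoding and of the delete rule described
> above; the actual harness is written in Go, this is only the same
> idea expressed against the LevelDB C++ API, and the helper names are
> made up:)
>
> ~~~~
> #include <cstdint>
> #include <string>
> #include "leveldb/db.h"
>
> // 8-byte little-endian encoding of an unsigned 64-bit integer; with
> // little endian, consecutive integers are not inserted in LevelDB's
> // lexicographic key order, which mixes the keys a little bit.
> static std::string le64(uint64_t v) {
>     std::string s(8, '\0');
>     for (int i = 0; i < 8; ++i)
>         s[i] = static_cast<char>((v >> (8 * i)) & 0xff);
>     return s;
> }
>
> // step 1: put pairs 0 .. n-1, value = key squared;
> // step 3: delete the keys matching `key & pattern == 0`.
> void put_and_delete(leveldb::DB* db, uint64_t n, uint64_t pattern) {
>     for (uint64_t k = 0; k < n; ++k)
>         db->Put(leveldb::WriteOptions(), le64(k), le64(k * k));
>     for (uint64_t k = 0; k < n; ++k)
>         if ((k & pattern) == 0)
>             db->Delete(leveldb::WriteOptions(), le64(k));
> }
> ~~~~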
>
> Results:
> ~~~~
> # get as `NewIterator()/Seek()/Compare()`
> del-pattern | put/s | get/s | del/s | reget/s
> 0x00 -- all |   47k |   17k |   34k |       4
> 0x10 -- 50% |   50k |   17k |   36k |     200
> 0x70 -- 14% |   48k |   17k |   38k |     762
> 0xf0 --  6% |   50k |   17k |   46k |    1928
> 0xf... none |   50k |   17k |    -- |     17k
> ~~~~
> ~~~~
> # get as `Get(Slice&,std::string*)`
> del-pattern | put/s | get/s | del/s | reget/s
> 0x00 -- all |   48k |   44k |   42k |     47k
> 0x10 -- 50% |   49k |   43k |   64k |     48k
> 0x70 -- 14% |   38k |   43k |   60k |     42k
> 0xf0 --  6% |   49k |   42k |   31k |     45k
> 0xf... none |   37k |   43k |    -- |     43k
> ~~~~
>
> Hope you find it useful,
> Ciprian.
>