[erlang-questions] [ANN] jch-erl 0.1 - Fast minimal memory consistent hashing for Erlang/OTP.

Sverker Eriksson <>
Tue Nov 11 14:41:08 CET 2014


Maybe not worth a pint, but...
If you want to ensure the first argument can be 64 bit long
you should use enif_get_uint64.
Type 'long' is only 64 bit on most 64bit OS (not Windows).

/Sverker, Erlang/OTP


On 11/07/2014 10:08 PM, Darach Ennis wrote:
> Hey folk.
>
> jch-erl 0.1 freshly baked.
>
> NIF wrapper for Jump Consistent Hash algorithm by John Lamping and Eric
> Veach developed at Google, Inc. Paper: "A Fast, Minimal Memory, Consistent
> Hash Algorithm".
>
> http://arxiv.org/ftp/arxiv/papers/1406/1406.2294.pdf
>
> This implementation uses the xorshift64* pseudo-random number generator
> rather than the linear congruential generator in the paper as it is
> reasonably fast but, more importantly, memory efficient.
>
> Usage
>
> A single function jch:ch/2 is offered. Simply pass in a 64 bit long (or
> less) integer key argument followed by the desired bucket or continuum
> partition size. The function returns the partition allocated for the key.
>
> Performance is very stable as bucket size increases and distribution across
> the ring is stable (standard deviation for a reasonable sample size is
> typically <2% relative to the mean).
>
> Example
>
> % ERL_LIBS=deps erl +sfwi 1 +scl false -pa ebin
> Erlang R16B02 (erts-5.10.3) [source] [64-bit] [smp:8:8] [async-threads:10]
> [hipe] [kernel-poll:false] [dtrace]
>
> Eshell V5.10.3  (abort with ^G)
> 1> jch:ch(1,128).
> 29
> 2> jch:ch(13,128).
> 121
> 3> jch:ch(13,128).
> 121
> 4> jch:ch(trunc(math:pow(2,64))-1,128).
> 78
> 5> jch:ch(trunc(math:pow(2,64)),128). %% off by 1 mate
> ** exception error: bad argument
>       in function  jch:ch/2
>          called as jch:ch(18446744073709551616,128)
> 6> %% off by 1 mate
> 6>
>
> Access times are typically sub-microsecond. Performance tests of the erlang
> NIF
> and underlying C function are included.
>
> Example performance for a space of 1 billion buckets:
>
> - 1B Buckets. Hash performance.
> ch1b N: 10 Min: -----1 Max: -----2 Median: -----1 Avg: -----1 Elapsed: 16
> ch1b N: 100 Min: -----0 Max: -----1 Median: -----1 Avg: -----1 Elapsed: 102
> ch1b N: 1000 Min: -----0 Max: -----9 Median: -----0 Avg: -----0 Elapsed:
> 1178
> ch1b N: 10000 Min: -----0 Max: ----13 Median: -----0 Avg: -----0 Elapsed:
> 10146
> ch1b N: 100000 Min: -----0 Max: ----59 Median: -----0 Avg: -----0 Elapsed:
> 109772
> ch1b N: 1000000 Min: -----0 Max: ---886 Median: -----0 Avg: -----0 Elapsed:
> 1048561
> ch1b N: 10000000 Min: -----0 Max: --3734 Median: -----0 Avg: -----0
> Elapsed: 11757297
>
> And a sample of how keys partition over the space:
>
> - 32 Buckets. 10M hashes Uniform Distribution Check.
> 312292 312457 312660 312522 313805 311987 311431 312299
> 311975 312834 312233 312125 313391 312222 312170 312389
> 311811 313134 312986 312434 312002 313060 312784 312699
> 312114 312170 312385 312721 313163 311883 313060 312802
> oOo| Min: 311431 Max: 313805 Median: 312389 Avg: 312500 Elapsed: 15748372
> Worst: 99.2435 Med: 99.5488 Avg: 99.5841 RSD: 0.9139
>
> The RSD here is the standard deviation relative to the mean (avg) which is
> less than 1%
> after 10M hash operations on random uniform test data. That's pretty good.
>
> A pint (or fancy fizzy water, or cup of pant wetting tea...) to anyone who
> can strip off an order of magnitude off the max outlier latencies and
> explain through scheduler, gc or emulator tuning what insight I missed or
> runtime vagary i've yet to master. Mainly written to toy with
> micro-benchmarking and get practice with tuning the emulator.
>
> Enjoy!
>
> Cheers,
>
> Darach.
>
>
>
> _______________________________________________
> erlang-questions mailing list
> 
> http://erlang.org/mailman/listinfo/erlang-questions

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20141111/f02d8041/attachment.html>


More information about the erlang-questions mailing list