[erlang-questions] finding the nearest key in a key-value table

Joe Armstrong erlang@REDACTED
Tue May 27 15:37:33 CEST 2014


I just tried a simple experiment

I made a 1M table of random IP adresses.
Assume we ask the 20 nearest machines where there 20 nearest machine are
this is 20 RPCs and we put the results (400 of them) back in the table
to update the table with 400 entries and find the twenty nearest in the
table took
10ms with a very naive ordered_sets lookup. This is fast enough for now ...

/Joe



On Tue, May 27, 2014 at 3:26 PM, Dmitry Kolesnikov
<dmkolesnikov@REDACTED>wrote:

> Yes, I mean k-buckets from Kademlia.
> From my perspective, a node eviction policy is an application specific
> task.
> I see reasons why 20 is chosen for Kademlia but any one can choose other
> eviction methods using memory limits or something else.
>
> I’ve used gb_trees + lists (using keyXXX methods) to implement k-bucket
> with k=20.
> The read/write access time was acceptable for me. However, some other data
> structure
> is required to implement it for scalable k. Maybe nested trees or dict /
> tree symbiosis.
>
> - Dmitry
>
>
>
> On 27 May 2014, at 16:12, Joe Armstrong <erlang@REDACTED> wrote:
>
> You mean K as in Kademlia ? - I thought about this
> In Kademlia K is 20 so we store 20 IPs per bit of a 160 bit sha1 hash ie
>  3200 IPs this takes 12.8KB
>
> When a bucket if full (in Kademlia) we throw away the least recently
> active entry in the bucket concerned- it seems to me that
> there is no reason to throw away anything until we have a really large
> number of machines.
>
> If there we 10M machine we'd need 40MB to store the IPs which is still <<
> 1GB - in other words a DHT
> with 10M machines only needs a 40MB cache of IPs ...
>
> /Joe
>
>
>
> On Tue, May 27, 2014 at 3:00 PM, Dmitry Kolesnikov <dmkolesnikov@REDACTED
> > wrote:
>
>>
>> On 27 May 2014, at 15:56, Jesper Louis Andersen <
>> jesper.louis.andersen@REDACTED> wrote:
>>
>>
>> On Tue, May 27, 2014 at 2:55 PM, Joe Armstrong <erlang@REDACTED> wrote:
>>
>>> Now why would anybody want to put 20 byte binaries in a hash table and
>>> find the K nearest hits :-)
>>
>>
>> Distributed Hash Table rings ;)
>>
>>
>> In this case, k-bucket is the right data structure to keep nodes.
>>
>>
>>
>> --
>> J.
>> _______________________________________________
>> erlang-questions mailing list
>> erlang-questions@REDACTED
>> http://erlang.org/mailman/listinfo/erlang-questions
>>
>>
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20140527/b122e5f9/attachment.htm>


More information about the erlang-questions mailing list