[erlang-questions] finding the nearest key in a key-value table

Tue May 27 14:39:50 CEST 2014

On 05/27/2014 12:58 PM, Joe Armstrong wrote:
> What's the best data structure to support a "nearest key" operation?
>
> I want a data structure that supports the following operations:
>
>      new() -> NewTree
>
>      insert(Key, Value, Tree) -> NewTree
>
>      delete(Key, Tree) -> NewTree
>
>      nearest(Key, N, Tree) -> [{Key1,Val1},{Key2,Val2} ...].
>
>      Keys are 20 byte binaries, values are 6 byte binaries
>
>      The maximum number of keys of Key-Value pairs is 10 million.
>
>     nearest(Key, N, Tree) returns the N nearest items to Key in the tree
>
>     "near" means that the keys are near to each other. Often the key I
> want will not
> be in the tree, so I want nearby keys. The nearness measure is just the
> "distance" between the two keys (ie the nearest N keys to the given key
> if the keys were to
> be sorted)
>
> Are there any Erlang libraries that do this?

Joe,

You have not really told us the most interesting aspect of your 
requirements. (Although one would be tempted to make an educated guess 
knowing you.)

Do you want your data structure to be concurrent or not?

If you want a concurrent such data structure, I am afraid you need to do 
something (much) more complicated than the code you posted a while ago...

Kostis

<shameless_plug>
   Incidentally, we have done some very recent work on measuring the 
scalability (or lack thereof) of concurrent ETS tables of type 
ordered_set and propose to base the implementation on a different, much 
more scalable data structure which dynamically adapts to contention.  To 
those interested, we can send a paper privately.
</shameless_plug>