[erlang-questions] finding the nearest key in a key-value table

Dmitry Kolesnikov dmkolesnikov@REDACTED
Tue May 27 13:23:03 CEST 2014


Hello,

r-tree / kd-tree might be use-full for you to handle “nearest” lookup. 
however, I am not aware of efficient implementation on Erlang.

Another, the best possible “available” option for your problem domain is sorted trees.
It gives you log N complexity. If the key distance is multi-dimentional then z-ordered 
curve technique would help you to cluster “nearby” keys.

You might take a look into my modification of gb_trees that allows to execute range queries (via iterator).
https://github.com/fogfish/feta/blob/master/src/gbt.erl   

Therefore, nearest key lookup becomes a range lookup according criteria of your application.

nearest(Key, N, Tree) ->
   iterator_to_list(
      gbt:range(key_to_range(Key, N), Tree)
     ,[]
   ).

iterator_to_list(Iter0, Acc) ->
   case gbt:iterator(Iter) of
      none ->
         lists:reverse(Acc);
      {Key, Val, Iter} ->
         iterator_to_list(Iter, [{Key, Val}|Acc])
   end.

I’ve notice that 10M key might be large for Erlang data structures (e.g. I am trying to limit gb_trees to 1 - 2M in my apps) if you have many writes. Therefore, you might convert you lookup table to ets but same techniques is applicable to search nearest keys.

Best Regards, 
Dmitry



On 27 May 2014, at 13:58, Joe Armstrong <erlang@REDACTED> wrote:

> 
> What's the best data structure to support a "nearest key" operation?
> 
> I want a data structure that supports the following operations:
> 
>      new() -> NewTree
> 
>      insert(Key, Value, Tree) -> NewTree
>      
>     delete(Key, Tree)  -> NewTree
>    
>     nearest(Key, N, Tree)     -> [{Key1,Val1},{Key2,Val2} ...].
> 
>     Keys are 20 byte binaries, values are 6 byte binaries
> 
>     The maximum number of keys of Key-Value pairs is 10 million.
> 
>    nearest(Key, N, Tree) returns the N nearest items to Key in the tree
> 
>    "near" means that the keys are near to each other. Often the key I want will not
> be in the tree, so I want nearby keys. The nearness measure is just the
> "distance" between the two keys (ie the nearest N keys to the given key if the keys were to
> be sorted)
> 
> Are there any Erlang libraries that do this?
> 
> /Joe
> _______________________________________________
> erlang-questions mailing list
> erlang-questions@REDACTED
> http://erlang.org/mailman/listinfo/erlang-questions




More information about the erlang-questions mailing list