[erlang-questions] Fwd: Re: finding the nearest key in a key-value table

Benoit Chesneau bchesneau@REDACTED
Tue May 27 16:32:07 CEST 2014


---------- Forwarded message ----------
From: "Benoit Chesneau" <bchesneau@REDACTED>
Date: May 27, 2014 4:31 PM
Subject: Re: [erlang-questions] finding the nearest key in a key-value table
To: "Dmitry Kolesnikov" <dmkolesnikov@REDACTED>
Cc:


On May 27, 2014 1:23 PM, "Dmitry Kolesnikov" <dmkolesnikov@REDACTED> wrote:
>
> Hello,
>
> r-tree / kd-tree might be use-full for you to handle “nearest” lookup.
> however, I am not aware of efficient implementation on Erlang.
>

here is an implementation of an rtree in erlang. not sure how much
efficient it is but it works.

> Another, the best possible “available” option for your problem domain is
sorted trees.
> It gives you log N complexity. If the key distance is multi-dimentional
then z-ordered
> curve technique would help you to cluster “nearby” keys.
>
> You might take a look into my modification of gb_trees that allows to
execute range queries (via iterator).
> https://github.com/fogfish/feta/blob/master/src/gbt.erl
>
> Therefore, nearest key lookup becomes a range lookup according criteria
of your application.
>
> nearest(Key, N, Tree) ->
>    iterator_to_list(
>       gbt:range(key_to_range(Key, N), Tree)
>      ,[]
>    ).
>
> iterator_to_list(Iter0, Acc) ->
>    case gbt:iterator(Iter) of
>       none ->
>          lists:reverse(Acc);
>       {Key, Val, Iter} ->
>          iterator_to_list(Iter, [{Key, Val}|Acc])
>    end.
>
> I’ve notice that 10M key might be large for Erlang data structures (e.g.
I am trying to limit gb_trees to 1 - 2M in my apps) if you have many
writes. Therefore, you might convert you lookup table to ets but same
techniques is applicable to search nearest keys.
>
> Best Regards,
> Dmitry
>
>
>
> On 27 May 2014, at 13:58, Joe Armstrong <erlang@REDACTED> wrote:
>
> >
> > What's the best data structure to support a "nearest key" operation?
> >
> > I want a data structure that supports the following operations:
> >
> >      new() -> NewTree
> >
> >      insert(Key, Value, Tree) -> NewTree
> >
> >     delete(Key, Tree)  -> NewTree
> >
> >     nearest(Key, N, Tree)     -> [{Key1,Val1},{Key2,Val2} ...].
> >
> >     Keys are 20 byte binaries, values are 6 byte binaries
> >
> >     The maximum number of keys of Key-Value pairs is 10 million.
> >
> >    nearest(Key, N, Tree) returns the N nearest items to Key in the tree
> >
> >    "near" means that the keys are near to each other. Often the key I
want will not
> > be in the tree, so I want nearby keys. The nearness measure is just the
> > "distance" between the two keys (ie the nearest N keys to the given key
if the keys were to
> > be sorted)
> >
> > Are there any Erlang libraries that do this?
> >
> > /Joe
> > _______________________________________________
> > erlang-questions mailing list
> > erlang-questions@REDACTED
> > http://erlang.org/mailman/listinfo/erlang-questions
>
> _______________________________________________
> erlang-questions mailing list
> erlang-questions@REDACTED
> http://erlang.org/mailman/listinfo/erlang-questions
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20140527/c0d9536e/attachment.htm>


More information about the erlang-questions mailing list