[erlang-questions] Fwd: Re: finding the nearest key in a key-value table

OvermindDL1 overminddl1@REDACTED
Tue May 27 22:08:45 CEST 2014


On May 27, 2014 8:32 AM, "Benoit Chesneau" <bchesneau@REDACTED> wrote:
>
> ---------- Forwarded message ----------
> From: "Benoit Chesneau" <bchesneau@REDACTED>
> Date: May 27, 2014 4:31 PM
> Subject: Re: [erlang-questions] finding the nearest key in a key-value
table
> To: "Dmitry Kolesnikov" <dmkolesnikov@REDACTED>
> Cc:
>
>
> On May 27, 2014 1:23 PM, "Dmitry Kolesnikov" <dmkolesnikov@REDACTED>
wrote:
> >
> > Hello,
> >
> > r-tree / kd-tree might be use-full for you to handle “nearest” lookup.
> > however, I am not aware of efficient implementation on Erlang.
> >
>
> here is an implementation of an rtree in erlang. not sure how much
efficient it is but it works.
>
> > Another, the best possible “available” option for your problem domain
is sorted trees.
> > It gives you log N complexity. If the key distance is multi-dimentional
then z-ordered
> > curve technique would help you to cluster “nearby” keys.
> >
> > You might take a look into my modification of gb_trees that allows to
execute range queries (via iterator).
> > https://github.com/fogfish/feta/blob/master/src/gbt.erl
> >
> > Therefore, nearest key lookup becomes a range lookup according criteria
of your application.
> >
> > nearest(Key, N, Tree) ->
> >    iterator_to_list(
> >       gbt:range(key_to_range(Key, N), Tree)
> >      ,[]
> >    ).
> >
> > iterator_to_list(Iter0, Acc) ->
> >    case gbt:iterator(Iter) of
> >       none ->
> >          lists:reverse(Acc);
> >       {Key, Val, Iter} ->
> >          iterator_to_list(Iter, [{Key, Val}|Acc])
> >    end.
> >
> > I’ve notice that 10M key might be large for Erlang data structures
(e.g. I am trying to limit gb_trees to 1 - 2M in my apps) if you have many
writes. Therefore, you might convert you lookup table to ets but same
techniques is applicable to search nearest keys.
> >
> > Best Regards,
> > Dmitry
> >
> >
> >
> > On 27 May 2014, at 13:58, Joe Armstrong <erlang@REDACTED> wrote:
> >
> > >
> > > What's the best data structure to support a "nearest key" operation?
> > >
> > > I want a data structure that supports the following operations:
> > >
> > >      new() -> NewTree
> > >
> > >      insert(Key, Value, Tree) -> NewTree
> > >
> > >     delete(Key, Tree)  -> NewTree
> > >
> > >     nearest(Key, N, Tree)     -> [{Key1,Val1},{Key2,Val2} ...].
> > >
> > >     Keys are 20 byte binaries, values are 6 byte binaries
> > >
> > >     The maximum number of keys of Key-Value pairs is 10 million.
> > >
> > >    nearest(Key, N, Tree) returns the N nearest items to Key in the
tree
> > >
> > >    "near" means that the keys are near to each other. Often the key I
want will not
> > > be in the tree, so I want nearby keys. The nearness measure is just
the
> > > "distance" between the two keys (ie the nearest N keys to the given
key if the keys were to
> > > be sorted)
> > >
> > > Are there any Erlang libraries that do this?
> > >
> > > /Joe
> > >

Unsure how efficient you could make it in erlang, pretty efficient I think,
it sounds like what you want is a Patricia Trie.  It by default gets the
node you ask for or the nearest matching one with the longest prefix of
whatever accuracy you wish, a minor change could just have it go down
nearby existing nodes to get the nearest match instead of just the matching
based on prefix, especially if you statically know the length of the
binaries.

_______________________________________________
> > > erlang-questions mailing list
> > > erlang-questions@REDACTED
> > > http://erlang.org/mailman/listinfo/erlang-questions
> >
> > _______________________________________________
> > erlang-questions mailing list
> > erlang-questions@REDACTED
> > http://erlang.org/mailman/listinfo/erlang-questions
>
>
> _______________________________________________
> erlang-questions mailing list
> erlang-questions@REDACTED
> http://erlang.org/mailman/listinfo/erlang-questions
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20140527/c2d418ed/attachment.htm>


More information about the erlang-questions mailing list