[erlang-questions] Fwd: Re: finding the nearest key in a key-value table
Benoit Chesneau
bchesneau@REDACTED
Tue May 27 16:32:07 CEST 2014
---------- Forwarded message ----------
From: "Benoit Chesneau" <bchesneau@REDACTED>
Date: May 27, 2014 4:31 PM
Subject: Re: [erlang-questions] finding the nearest key in a key-value table
To: "Dmitry Kolesnikov" <dmkolesnikov@REDACTED>
Cc:
On May 27, 2014 1:23 PM, "Dmitry Kolesnikov" <dmkolesnikov@REDACTED> wrote:
>
> Hello,
>
> r-tree / kd-tree might be use-full for you to handle “nearest” lookup.
> however, I am not aware of efficient implementation on Erlang.
>
here is an implementation of an rtree in erlang. not sure how much
efficient it is but it works.
> Another, the best possible “available” option for your problem domain is
sorted trees.
> It gives you log N complexity. If the key distance is multi-dimentional
then z-ordered
> curve technique would help you to cluster “nearby” keys.
>
> You might take a look into my modification of gb_trees that allows to
execute range queries (via iterator).
> https://github.com/fogfish/feta/blob/master/src/gbt.erl
>
> Therefore, nearest key lookup becomes a range lookup according criteria
of your application.
>
> nearest(Key, N, Tree) ->
> iterator_to_list(
> gbt:range(key_to_range(Key, N), Tree)
> ,[]
> ).
>
> iterator_to_list(Iter0, Acc) ->
> case gbt:iterator(Iter) of
> none ->
> lists:reverse(Acc);
> {Key, Val, Iter} ->
> iterator_to_list(Iter, [{Key, Val}|Acc])
> end.
>
> I’ve notice that 10M key might be large for Erlang data structures (e.g.
I am trying to limit gb_trees to 1 - 2M in my apps) if you have many
writes. Therefore, you might convert you lookup table to ets but same
techniques is applicable to search nearest keys.
>
> Best Regards,
> Dmitry
>
>
>
> On 27 May 2014, at 13:58, Joe Armstrong <erlang@REDACTED> wrote:
>
> >
> > What's the best data structure to support a "nearest key" operation?
> >
> > I want a data structure that supports the following operations:
> >
> > new() -> NewTree
> >
> > insert(Key, Value, Tree) -> NewTree
> >
> > delete(Key, Tree) -> NewTree
> >
> > nearest(Key, N, Tree) -> [{Key1,Val1},{Key2,Val2} ...].
> >
> > Keys are 20 byte binaries, values are 6 byte binaries
> >
> > The maximum number of keys of Key-Value pairs is 10 million.
> >
> > nearest(Key, N, Tree) returns the N nearest items to Key in the tree
> >
> > "near" means that the keys are near to each other. Often the key I
want will not
> > be in the tree, so I want nearby keys. The nearness measure is just the
> > "distance" between the two keys (ie the nearest N keys to the given key
if the keys were to
> > be sorted)
> >
> > Are there any Erlang libraries that do this?
> >
> > /Joe
> > _______________________________________________
> > erlang-questions mailing list
> > erlang-questions@REDACTED
> > http://erlang.org/mailman/listinfo/erlang-questions
>
> _______________________________________________
> erlang-questions mailing list
> erlang-questions@REDACTED
> http://erlang.org/mailman/listinfo/erlang-questions
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20140527/c0d9536e/attachment.htm>
More information about the erlang-questions
mailing list