[erlang-questions] rbdict - A dictionary as a Red-Black tree

Robert Virding <>
Mon Jun 16 21:43:53 CEST 2008


2008/6/16 Jacob Perkins <>:

> I ran the following code thru fprof using N=1000 for Mod=dict and
> Mod=rbdict. The basic results were that rbdict was ~10-20ms slower than dict
> for both store and fetch.
>
> store(Mod, N) ->
>     Fold = fun(I, D) -> Mod:store(I, I, D) end,
>     lists:foldl(Fold, Mod:new(), lists:seq(1, N)).
>
> fetch(Mod, N, Dict) ->
>     Map = fun(I) -> Mod:fetch(I, Dict) end,
>     lists:map(Map, lists:seq(1, N)).


This is expected, rb-trees are O(n lg n) but dict is based on a hashing
algorithm as is almost O(1).

Here are some more comparisons done on a 10k element random list:

           Insert      Fetch       Update      Delete

dict        31.8        6.64        22.345      21.09
rbdict      28.36       8.435       27.11       21.795
gb_trees    22.105      9.61        17.735      15.235
            30.94                   28.825      24.845

Pardon the alignment but it looks ok with constant width.

Gb_trees have two values each for inserting/updating/deletion as it has two
functions for each depending on whether you know if the element is in the
tree or not. It is faster if you know. The dict modules don't have this
separation in the interface.

The benefit of using rbdict (and gb_trees) is that it is a much lighter
structure. It is also ordered but there is no interface for this. Which one
you choose depends of course on your application and it is easy to use the
different modules and measure. Even testing with gb_trees is easy as the
interfaces are close.

Robert
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20080616/4bb8b975/attachment.html>


More information about the erlang-questions mailing list