[erlang-questions] Benchmarking Erlang: Deathmatch of gb_trees, dict, ets, mnesia ... and registered names

Thu Oct 9 14:17:49 CEST 2008

Joel Reymont wrote:
> Dict seems to be the winner for 100K values. What's particularly
> interesting to me is that gb_trees, dict and ets take about the
> same time to look up 1 mil. values. Is there an explanation?

Joel, one of the problems with benchmarking is being sure what it
is that you are measuring. Your code for "populate" uses
gen_server:cast, which basically means that what you measure is
message passing time.
Furthermore, for large numbers of messages, you start running the
lookup test before the server has had time to process its huge
in-queue, hence the strangely similar numbers 5.0xxx for most tests.
The hint is that your numbers _should_ scale at least linearly with
the number of elements; a lookup time that is the same for both
100k and 1000k elements is completely unreasonable.
(The confirmation for me was that even though the test had completed,
my CPU was still working like crazy until I shut down the VM again;
the server was still processing a million backed-up ADD requests.)
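
To illustrate the difference between cast and call, here is a minimal
sketch of a dict-backed server. The module name kv_server and all of its
function names are hypothetical and are not taken from your code; only
gen_server, dict and erlang:process_info/2 are real library calls.

```erlang
%% Hypothetical module for illustration; not the benchmark's actual code.
-module(kv_server).
-behaviour(gen_server).

-export([start_link/0, add_async/3, add_sync/3, lookup/2, backlog/1]).
-export([init/1, handle_call/3, handle_cast/2]).

start_link() ->
    gen_server:start_link(?MODULE, [], []).

%% cast: returns as soon as the message has been sent. The server may
%% not get around to storing the element until much later.
add_async(Pid, Key, Val) ->
    gen_server:cast(Pid, {add, Key, Val}).

%% call: returns only after the server has stored the element and replied.
add_sync(Pid, Key, Val) ->
    gen_server:call(Pid, {add, Key, Val}).

lookup(Pid, Key) ->
    gen_server:call(Pid, {lookup, Key}).

%% Number of messages still waiting in the server's mailbox.
backlog(Pid) ->
    {message_queue_len, Len} = erlang:process_info(Pid, message_queue_len),
    Len.

init([]) ->
    {ok, dict:new()}.

handle_call({add, Key, Val}, _From, Dict) ->
    {reply, ok, dict:store(Key, Val, Dict)};
handle_call({lookup, Key}, _From, Dict) ->
    {reply, dict:find(Key, Dict), Dict}.

handle_cast({add, Key, Val}, Dict) ->
    {noreply, dict:store(Key, Val, Dict)}.
```

Calling kv_server:backlog(Pid) right after a long run of add_async
requests should show how much work is still queued when the timer stops.
After the same run with add_sync, and no other senders, the mailbox is
empty, because each request has been answered before the next is sent.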

Changing cast to call in your code, I get this for dict and gb_trees:
----
map1:test(10000).
Populate: 0.1408
Lookup:   0.0497

map1:test(100000).
Populate: 1.9648
Lookup:   0.5244

map1:test(1000000).
Populate: 34.2963
Lookup:   5.2909
----
map2:test(10000).
Populate: 0.1187
Lookup:   0.0519

map2:test(100000).
Populate: 0.9676
Lookup:   0.5331

map2:test(1000000).
Populate: 40.2369
Lookup:   5.8442
----

These numbers make more sense: they scale with the number of elements,
and gb_trees wins over dict.
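
If the goal is to compare the data structures themselves, one option is
to take the server out of the picture entirely. The following is only a
sketch under my own assumptions: the module direct_bench and its
functions are hypothetical, the keys are plain integers, and it relies
on timer:tc/1, which is only available in more recent OTP releases.

```erlang
%% Hypothetical helper module; not part of the original benchmark.
-module(direct_bench).
-export([run/1]).

%% Times populate and lookup for dict and gb_trees on N integer keys,
%% with no server process and no message passing involved.
%% Returns a list of {Name, PopulateMicros, LookupMicros} tuples.
run(N) ->
    Keys = lists:seq(1, N),
    [bench(dict, Keys,
           dict:new(),
           fun(K, D) -> dict:store(K, K, D) end,
           fun(K, D) -> {ok, K} = dict:find(K, D) end),
     bench(gb_trees, Keys,
           gb_trees:empty(),
           fun(K, T) -> gb_trees:insert(K, K, T) end,
           fun(K, T) -> {value, K} = gb_trees:lookup(K, T) end)].

bench(Name, Keys, Empty, Store, Find) ->
    %% Populate: fold every key into the initially empty structure.
    {PopUs, Full} = timer:tc(fun() -> lists:foldl(Store, Empty, Keys) end),
    %% Lookup: find every key again; a miss crashes on the pattern match.
    {LookUs, ok} = timer:tc(fun() ->
                                lists:foreach(fun(K) -> Find(K, Full) end, Keys)
                            end),
    {Name, PopUs, LookUs}.
```

Running direct_bench:run(N) for N = 10000, 100000 and 1000000 gives
times in microseconds that can be checked for the expected scaling.
The absolute numbers will differ from the ones above, since these
exclude the gen_server round trip.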

    /Richard