[erlang-questions] Benchmarking Erlang: Deathmatch of gb_trees, dict, ets, mnesia ... and registered names
Joel Reymont
joelr1@REDACTED
Thu Oct 9 13:08:01 CEST 2008
I'm still trying to figure out how to optimize pid <-> integer mapping
and I thought I'd try several approaches. My goal is to look up a
process id as quickly as possible given an integer.
My system is a Mac Pro 2x2.8GHz Quad Xeon, 14GB of memory
Erlang (BEAM) emulator version 5.6.3 [source] [smp:8] [async-threads:0] [kernel-poll:false]
Here's my set of timings, code at the end of this message.
%% gb_trees
1> map1:test(10000).
Populate: 0.0972
Lookup: 0.0912
ok
2> map1:test(100000).
Populate: 0.8737
Lookup: 5.0007
ok
3> map1:test(1000000).
Populate: 9.9215
Lookup: 5.0010
ok
%% dict
1> map2:test(10000).
Populate: 0.1035
Lookup: 0.0730
ok
2> map2:test(100000).
Populate: 1.0407
Lookup: 1.2715
ok
3> map2:test(1000000).
Populate: 10.5010
Lookup: 5.0010
ok
%% ets
4> map3:test(10000).
Populate: 0.1140
Lookup: 0.0448
ok
5> map3:test(100000).
Populate: 1.3435
Lookup: 0.4669
ok
6> map3:test(1000000).
Populate: 11.6472
Lookup: 5.0860
ok
Dict seems to be the winner for 100K values. What's particularly
interesting to me is that gb_trees, dict and ets take about the
same time to look up 1 million values. Is there an explanation?
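For readers without the follow-up messages, here is a minimal sketch of how populate and lookup timings of this kind could be taken for ets. It is not the map3 code: the module name ets_map_sketch, the table name pid_map and the use of timer:tc/1 are all assumptions made for illustration, and it stores the caller's own pid under every key instead of spawning one process per id.

```erlang
-module(ets_map_sketch).
-export([test/1]).

%% Hypothetical harness, NOT the map3 module from this thread.
%% Times N inserts and N lookups of integer -> pid pairs in an ets set.
test(N) when is_integer(N), N > 0 ->
    Tab = ets:new(pid_map, [set, private]),
    %% timer:tc/1 returns {Microseconds, Result}.
    {PopUs, ok} = timer:tc(fun() -> populate(Tab, N) end),
    {LookUs, ok} = timer:tc(fun() -> lookup_all(Tab, N) end),
    io:format("Populate: ~.4f~nLookup: ~.4f~n",
              [PopUs / 1.0e6, LookUs / 1.0e6]),
    true = ets:delete(Tab),
    ok.

%% Assumption: every key maps to the calling process; a real system
%% would store a distinct pid per id.
populate(_Tab, 0) -> ok;
populate(Tab, I) ->
    true = ets:insert(Tab, {I, self()}),
    populate(Tab, I - 1).

%% Look up every key once and check that a pid comes back.
lookup_all(_Tab, 0) -> ok;
lookup_all(Tab, I) ->
    [{I, Pid}] = ets:lookup(Tab, I),
    true = is_pid(Pid),
    lookup_all(Tab, I - 1).
```

Calling ets_map_sketch:test(100000) prints the two timings in seconds and returns ok. Because both phases run entirely in the calling process, nothing in this sketch waits on a message or a timeout.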
map4, map5 and map6 test mnesia ram_copies, disc_only_copies and
ram_copies in a 2-node setup, respectively. There seems to be no
overhead compared to ets, though. I can't explain this either.
The absolutely bizarre and surprising discovery is the timings for
registered names. On a hunch, I thought I'd try to register a process
under its id. This way I could just send messages to that id from any
node in the cluster, without having to go through a separate
translation process.
%% registered names
(2@REDACTED)2> map7:test(10000).
Populate: 0.8526
Lookup: 0.0040
ok
(2@REDACTED)1> map7:test(100000).
Populate: 8.8507
Lookup: 0.0605
ok
(2@REDACTED)3> map7:test(1000000).
Populate: 94.6030
Lookup: 0.8558
It seems that registering processes under their integer id is the way
to go. Am I right in my conclusion? Are there any pitfalls with going
this route?
Between players and games, I would most likely have <100K ids
registered at any given time.
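As a rough illustration of the registered-name idea, here is a minimal sketch that registers a process under an integer id. It assumes the global module is used, since global:register_name/2 accepts any term as a name and the name is visible on every connected node. The map7 code is not shown in this message and may differ. The module name reg_sketch and the functions start/1, lookup/1 and stop/1 are hypothetical.

```erlang
-module(reg_sketch).
-export([start/1, lookup/1, stop/1]).

%% Hypothetical helpers, NOT the map7 module from this thread.
%% Assumption: names are registered with global, not register/2.

%% Spawn a process and register it under the integer Id.
start(Id) when is_integer(Id) ->
    Pid = spawn(fun loop/0),
    case global:register_name(Id, Pid) of
        yes ->
            {ok, Pid};
        no ->
            %% The name is already taken; discard the new process.
            exit(Pid, kill),
            {error, already_registered}
    end.

%% Returns the pid registered under Id, or undefined.
lookup(Id) ->
    global:whereis_name(Id).

%% Ask the process registered under Id to terminate, if there is one.
stop(Id) ->
    case global:whereis_name(Id) of
        undefined -> ok;
        Pid -> Pid ! stop, ok
    end.

%% Trivial server loop standing in for a player or game process.
loop() ->
    receive
        stop -> ok;
        _Other -> loop()
    end.
```

With this in place, global:send(Id, Msg) delivers a message by id from any connected node; it exits with badarg if the id is not registered. Note that the local register/2 only accepts atoms as names, so using it with integer ids would mean creating one atom per id, and atoms are never garbage collected.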
Code in follow-up messages, registered names first, then gb_trees,
dict, ets and mnesia.
Thanks, Joel
--
wagerlabs.com