Some new mnesia benchmarking results
Ulf Wiger
etxuwig@REDACTED
Fri Nov 9 15:28:08 CET 2001
On Wed, 7 Nov 2001, Per Bergqvist wrote:
>Hi Sean,
>
>agree that mnesia is now getting ready for prime time as long
>as you have small records ;-). (Results are even more
>impressive on my little 1.7 GHz P4 Linux box )
>
>I have a nasty problem that I still haven't resolved. If you
>have more and larger records than you have available RAM the
>system will get on its knees.
>
>I have tried to use disc_only_copies, but after 1 hour it had
>only written ~1M records so I aborted it. This should be
>compared with the 15K records/sec I get with disc_copies.
>
>Does anyone have a solution on how to store 10Mx2K records with
>mnesia without buying 20GB RAM ?
>
>/Per
What are your access patterns like?
I have some ideas that I'm testing currently.
1. Dan and Håkan have written a module in mnesia called
mnesia_frag.erl. It's wholly undocumented, except for the
source, but it seems to work really well. The idea is that
you can treat a number of regular mnesia tables as fragments
of a larger table. Operations on the "base table" access
the appropriate fragment based on a hashing function on
the key. Fragments are distributed evenly across a pool
of processors.
2. I have written (but not tested much yet) a modified
mnesia_frag that also supports static distribution of
fragments using a callback function to identify each
fragment instead of using a hash.
3. One could imagine splitting a very large disc_only_copies table
into several smaller fragments, even on a single node
system. This wouldn't be of much use if the fragments are
replicated, and the application requires the tables to be
fully synched before they can be used (I've found this to
be a logical way to do things in real systems.) When synching
replicas, mnesia will read all objects into memory and pass
them to the other node in regular Erlang messages. This is
done for one table at a time.
4. On to my most recent experiment: I've modified mnesia to
load a configurable number of tables in parallel. This
gave particularly good results with many disc_only_copies
tables (in one test, I reduced the startup time by half.)
In combination with (3) above, this might be of some
help to you. I also wouldn't mind assistance in verifying
that my patch doesn't jeopardize stability.
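To make the routing idea in (1) and (2) concrete, here is a small sketch. It is written in Python rather than Erlang, it does not use the mnesia_frag API, and every name in it (hash_fragment, static_fragment, FragmentedTable) is hypothetical. The CRC32 hash and the "blocks of 1000 keys" rule are assumptions chosen for illustration only; they are not what mnesia_frag or my modified version actually does.

```python
import zlib


def hash_fragment(key, n_fragments):
    """Idea (1): pick a fragment by hashing the key.

    CRC32 is only a stand-in hash for this sketch.
    """
    return zlib.crc32(repr(key).encode("utf-8")) % n_fragments


def static_fragment(key, n_fragments):
    """Idea (2): pick a fragment with an application-supplied rule.

    Assumed rule: integer keys are placed in contiguous blocks of 1000.
    """
    return (key // 1000) % n_fragments


class FragmentedTable:
    """A 'base table' that forwards each operation to one fragment."""

    def __init__(self, n_fragments, choose=hash_fragment):
        self.n_fragments = n_fragments
        self.choose = choose  # callback: (key, n_fragments) -> index
        # Each fragment is an ordinary, independent table (here: a dict).
        self.fragments = [dict() for _ in range(n_fragments)]

    def write(self, key, value):
        self.fragments[self.choose(key, self.n_fragments)][key] = value

    def read(self, key):
        return self.fragments[self.choose(key, self.n_fragments)].get(key)
```

The caller only ever talks to the base table; swapping hash_fragment for static_fragment changes where records land without changing the calling code, which is the point of making the fragment choice a callback.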
As a side note, disc_only_copies tables shouldn't need to be read
into memory, but if they are replicated, they are. Mnesia will always
copy the entire table (the most current copy) to the other nodes, by
traversing the dets file and sending the objects to a receiver on
the other node (aggregated into 8K chunks). There is
tremendous optimization potential here (perhaps the next
experiment, but it's a non-trivial problem to solve...)
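As a rough illustration of the "aggregated into 8K chunks" step, the sketch below groups serialized objects into chunks of at most about 8 KB while traversing the source. It is Python, not mnesia's code: chunked is a hypothetical name, and pickle stands in for Erlang's term encoding as an assumption of the sketch.

```python
import pickle

CHUNK_BYTES = 8 * 1024  # the 8K chunk size mentioned above


def chunked(objects, limit=CHUNK_BYTES):
    """Yield lists of serialized objects, each list at most `limit` bytes.

    An object that is larger than `limit` on its own is sent alone
    in its own chunk.
    """
    chunk, size = [], 0
    for obj in objects:
        blob = pickle.dumps(obj)
        # Flush the current chunk if adding this object would overflow it.
        if chunk and size + len(blob) > limit:
            yield chunk
            chunk, size = [], 0
        chunk.append(blob)
        size += len(blob)
    if chunk:
        yield chunk
```

Because chunked is a generator, the sender in this sketch only holds one chunk at a time; whether the receiving side can avoid holding the whole table is a separate question.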
/Uffe
(These are not official statements of the mnesia team. ;)
--
Ulf Wiger, Senior Specialist,
/ / / Architecture & Design of Carrier-Class Software
/ / / Strategic Product & System Management
/ / / Ericsson Telecom AB, ATM Multiservice Networks