avoiding overloading mnesia

Wed Aug 12 11:48:22 CEST 2009

There are some recurring themes when it comes to Mnesia:

1 Mnesia handles partitioned networks poorly
2 Mnesia doesn't scale
3 Stay away from transactions

I've argued that Mnesia provides the tools to handle [1],
and that most DBMSs that guarantee transaction-level
consistency are hard-pressed to do better. A few offer
functionality (e.g. MySQL Cluster's Arbitrator) that could
be added on top of the basic functionality provided by
Mnesia. DBMSs that offer 'Eventual consistency' may fare
better. OTOH, one should really think about what the
consistency requirements of the application are, and pick
a DBMS that aims for that level.

Regarding [2], there are examples of Mnesia databases that
have achieved very good scalability. It is not the best
regarding writes/second to persistent storage, but as with
[1], think about what your requirements are. Tcerl, just
to name an example, gives much better write throughput, but
requires you to explicitly flush to disk. Chances are that
your data loss will be much greater if you suffer e.g.
a power failure. Don't take this as criticism of tcerl, but
think about what your recovery requirements are.

I am very wary about [3], mainly because I've seen many
abuses of dirty operations, and observed that many who use
dirty updates do it just because "it has to be fast",
without having measured performance using transactions, or
thought about what they give up when using dirty updates.
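
A minimal sketch of such a measurement, assuming Mnesia is
already running and every record in Records belongs to an
existing table. The module name write_bench and the function
compare/1 are made up for illustration; timer:tc/1,
mnesia:dirty_write/1, mnesia:write/1 and mnesia:transaction/1
are the standard calls. Treat the numbers as a rough first
indication only, and run it against your real table layout and
replica set.

```erlang
-module(write_bench).
-export([compare/1]).

%% Writes the same records twice: first with dirty writes, then with
%% one transaction per record. Returns the elapsed time of each run
%% in microseconds.
compare(Records) ->
    {DirtyUs, _} =
        timer:tc(fun() ->
                         lists:foreach(fun mnesia:dirty_write/1, Records)
                 end),
    {TxUs, _} =
        timer:tc(fun() ->
                         lists:foreach(
                           fun(Rec) ->
                                   {atomic, ok} =
                                       mnesia:transaction(
                                         fun() -> mnesia:write(Rec) end)
                           end, Records)
                 end),
    [{dirty_us, DirtyUs}, {transaction_us, TxUs}].
```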

In some cases, transactions can even be faster than dirty
operations. This is mainly true if you are doing batch
updates on a table with many replicas. With dirty writes,
you replicate once for each write, whereas a transaction
replicates all changes in a single commit message. Taking a
table lock will more or less eliminate the locking overhead
in this case, and sticky locks can make it even cheaper.
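
The difference can be sketched roughly as follows. The module
and function names are hypothetical, and the code assumes that
all records in Records belong to the table Tab; the mnesia
calls themselves (dirty_write/1, write_lock_table/1, write/3
with the sticky_write lock kind) are part of the standard API.

```erlang
-module(batch_write).
-export([dirty_batch/1, tx_batch/2, tx_batch_sticky/2]).

%% Dirty: every record is replicated on its own.
dirty_batch(Records) ->
    lists:foreach(fun(Rec) -> ok = mnesia:dirty_write(Rec) end, Records).

%% Transaction: one table lock up front instead of one lock per
%% record, and all changes travel together at commit time.
tx_batch(Tab, Records) ->
    mnesia:transaction(
      fun() ->
              mnesia:write_lock_table(Tab),
              lists:foreach(fun(Rec) ->
                                    ok = mnesia:write(Tab, Rec, write)
                            end, Records)
      end).

%% Same batch using sticky write locks, which stay on the writing
%% node after commit so that later transactions from that node can
%% acquire them more cheaply.
tx_batch_sticky(Tab, Records) ->
    mnesia:transaction(
      fun() ->
              lists:foreach(fun(Rec) ->
                                    ok = mnesia:write(Tab, Rec, sticky_write)
                            end, Records)
      end).
```

Both transactional variants return {atomic, ok} on success.
Sticky locks only pay off when the writes keep coming from the
same node.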

Apart from the obvious problems with dirty writes (no
concurrency protection above object-level atomicity,
no guarantee that the replicas will stay consistent),
there is also a bigger problem of overload.

If you have a write-intensive system, and most writes
take place from one node, and are replicated to one or
more others, consider that the replication requests all
go through the mnesia_tm process on the remote node,
while the writers perform the 'rpc' from within their
own process. Thus, if you have thousands of processes
writing dirty to a table, the remote mnesia_tm process(es)
may well become swamped.

This doesn't happen as easily with transactions, since all
processes using transactions also have to go through their
local mnesia_tm.

One thing that can be done to mitigate this is to use
sync_dirty. This will cause the writer to wait for the
remote mnesia_tm process(es) to reply, so a writer cannot
issue its next update until the previous one has been
applied remotely. If you also have some way of limiting the
number of writers, you ought to be able to protect against
this kind of overload.
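
One possible way of limiting the number of writers is a small,
fixed set of writer processes that all updates are funnelled
through. This is only a sketch under that assumption, not a
recommendation of a particular design: the module
bounded_writer and its functions are invented for the example,
and it assumes ordinary Mnesia records with the key in the
second tuple position. Only mnesia:sync_dirty/1 and
mnesia:write/1 come from Mnesia itself.

```erlang
-module(bounded_writer).
-export([start/1, write/2]).

%% Start N writer processes and return them as a tuple.
start(N) when is_integer(N), N > 0 ->
    list_to_tuple([spawn_link(fun loop/0) || _ <- lists:seq(1, N)]).

%% Pick a writer by hashing the record key, so that updates to the
%% same object always go through the same writer, then block until
%% that writer has finished the sync_dirty write.
write(Writers, Rec) ->
    Key = element(2, Rec),
    Pid = element(erlang:phash2(Key, tuple_size(Writers)) + 1, Writers),
    Ref = erlang:monitor(process, Pid),
    Pid ! {write, self(), Ref, Rec},
    receive
        {Ref, Result} ->
            erlang:demonitor(Ref, [flush]),
            Result;
        {'DOWN', Ref, process, Pid, Reason} ->
            {error, Reason}
    end.

%% Each writer handles one request at a time, so at most N
%% sync_dirty writes are in flight towards the remote nodes.
loop() ->
    receive
        {write, From, Ref, Rec} ->
            Result = try mnesia:sync_dirty(fun() -> mnesia:write(Rec) end)
                     catch exit:Reason -> {error, Reason}
                     end,
            From ! {Ref, Result},
            loop()
    end.
```

With Writers = bounded_writer:start(8), thousands of client
processes can call bounded_writer:write(Writers, Rec), but they
queue up locally in the writers' mailboxes instead of flooding
the remote mnesia_tm.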

My personal preference is to always start with transactions,
until they have proven inadequate. Most of the time, I find
that they are just fine, but YMMV.

BR,
Ulf W
-- 
Ulf Wiger
CTO, Erlang Training & Consulting Ltd
http://www.erlang-consulting.com