[erlang-questions] Call for Contributions: Mnesia best practices

Thu Jul 3 01:19:29 CEST 2008

On Jul 2, 2008, at 4:46 PM, Bob Calco wrote:
> I'm looking for thoughts from fellow Erlangers about database design &
> implementation in Mnesia. With a heavy SQL background I, like many
> relatively new Erlang folks I'm sure, have a tendency to think in terms of
> the capabilities of traditional RDBMSs, and to try to normalize every data
> model with which I come into contact.

Some uninformed comments follow.

> The question is: What is the best advice you could give a data architect
> about designing and implementing a database in Mnesia from scratch?
> Examples of the kinds of issues I'd like to see folks address:

First of all:

Mnesia is not actually a fully fledged relational database.
It is a simple key-value store with some query capabilities thrown in.

On the other hand, it is really well distributed.
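
To make the "key-value plus some queries" point concrete, here is a
minimal sketch. The module name, the person record and its fields are
made up for illustration, and the table is RAM-only on the local node
so it can be tried in a plain shell.

```erlang
-module(kv_sketch).
-export([demo/0]).

%% Hypothetical record; the first field (id) is the table key.
-record(person, {id, name, role}).

demo() ->
    %% Without a disc schema, Mnesia starts with a RAM-only schema.
    ok = mnesia:start(),
    %% Note: calling demo/0 twice fails here with already_exists.
    {atomic, ok} = mnesia:create_table(person,
        [{attributes, record_info(fields, person)},
         {ram_copies, [node()]}]),
    %% Plain key-value write, inside a transaction.
    {atomic, ok} = mnesia:transaction(fun() ->
        mnesia:write(#person{id = 1, name = "Bob", role = client})
    end),
    %% Plain key-value read by {Table, Key}.
    {atomic, [#person{name = Name}]} = mnesia:transaction(fun() ->
        mnesia:read({person, 1})
    end),
    %% The "query capabilities": pattern match on a non-key field.
    {atomic, Clients} = mnesia:transaction(fun() ->
        mnesia:match_object(#person{role = client, _ = '_'})
    end),
    {Name, Clients}.
```

Anything resembling a join or a foreign key has to be done by hand in
the application (or with QLC); the table itself only knows about keys
and, optionally, secondary indexes.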

> * How to create an optimal data model for performance (vs. reporting,
> comparing the SQL way to the Mnesia way). This question is really about
> normalization in Mnesia vs. SQL, and tricks like storing whole records in
> table fields.

There is no silver bullet ;)

Look at the application requirements: it is wise to give special
attention to the queries that are used 80% of the time. And not only at
the database level, but probably much more at the application level:
should something be cached, precomputed, distributed, etc.?

> * How to partition data between subsystems, without losing the illusion
> they're all one big happy system.

What? Why should the application see the partitioning at all? Hide the
partitioning from the consumer behind some middleman.

For a classic example, look at the map -> pmap evolution:
http://www.erlang.org/ml-archive/erlang-questions/200606/msg00187.html
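
A minimal sketch of that idea, written from memory in the usual style
and not copied from the linked thread: the caller uses pmap/2 exactly
like lists:map/2 and never sees that the work is spread over
processes. The module name is made up.

```erlang
-module(pmap_sketch).
-export([pmap/2]).

%% Same contract as lists:map/2, but each element is handled by its
%% own process. Results come back in the original list order.
pmap(F, List) ->
    Parent = self(),
    Refs = [begin
                Ref = make_ref(),
                %% Linked: if F crashes in a worker, the caller is
                %% taken down too (unless it traps exits).
                spawn_link(fun() -> Parent ! {Ref, F(X)} end),
                Ref
            end || X <- List],
    %% Selective receive on each Ref keeps the results ordered.
    [receive {Ref, Result} -> Result end || Ref <- Refs].
```

The same trick applies to partitioned data: put the fan-out and the
gathering behind one function or server, and keep the callers unaware
of where the pieces live.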

> * How to handle complex clustering and failover scenarios

On programming reliable systems, see Joe Armstrong's thesis:
http://www.sics.se/~joe/thesis/armstrong_thesis_2003.pdf

Failures will happen - just fail fast and recover fast enough;)

> * How to handle calculation-intensive databases (for example, stock
> databases that need to constantly recalculate certain attributes for the
> purposes of sorting, searching)

That depends. Is the data real time? Historical? Could you use a process
per requesting user?

> * How to handle complex domain relationships. For example, let's say you
> are writing a CRM tool and want to store each "person" in the database.
> But each person can also be a colleague, or a client, or an incidental
> character (contact person at some organization). I.e., what do you do
> when there is inheritance in your domain model?

Ask the "person" who he is? I.e. create a separate
process/server/distributed application and ask it via some protocol.
Model each entity as a process that knows every message it can be
asked.
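
A rough sketch of what I mean, with made-up module, function and
message names (a real system would more likely use gen_server): each
person is a process holding its own name and roles, and the only way
to find out who it is, is to ask.

```erlang
-module(person_sketch).
-export([start/2, roles/1, add_role/2]).

%% Start one process per person. Roles is a list of atoms such as
%% [client, colleague]; both names are illustrative only.
start(Name, Roles) ->
    spawn(fun() -> loop(Name, Roles) end).

%% Ask the person who he is; gives up after one second.
roles(Pid) ->
    Ref = make_ref(),
    Pid ! {self(), Ref, who_are_you},
    receive
        {Ref, Answer} -> Answer
    after 1000 -> timeout
    end.

%% Asynchronous: the person picks up a new role.
add_role(Pid, Role) ->
    Pid ! {add_role, Role},
    ok.

loop(Name, Roles) ->
    receive
        {From, Ref, who_are_you} ->
            From ! {Ref, {Name, Roles}},
            loop(Name, Roles);
        {add_role, Role} ->
            %% usort keeps the role list sorted and duplicate-free.
            loop(Name, lists:usort([Role | Roles]))
    end.
```

"Inheritance" then becomes a question of which messages a process
answers, not of how the tables are laid out.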

> * What are some current pitfalls or "weak spots" of Mnesia that ought to
> be avoided, however tempting they might be?

No "generic" solution for recovering from a partitioned network.
Recovery time after a crash.
HUGE datasets.
No nice tools like Oracle Enterprise Manager.
Low level, with no referential constraints, i.e. it is not a fully
fledged RDBMS.
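
On the first point, Mnesia at least tells you when it notices a
partition, even though it will not heal it for you. A minimal sketch,
assuming Mnesia is already running on the node; the module name is
made up, and what to do in the handler (pick a master with
mnesia:set_master_nodes/1 and restart, merge by hand, ...) is left
open because it depends on the application.

```erlang
-module(partition_watch).
-export([start/0]).

%% Spawn a watcher that subscribes to Mnesia system events.
%% Assumes mnesia is started; otherwise subscribe/1 returns an
%% error tuple and the watcher crashes on the match below.
start() ->
    spawn(fun() ->
        {ok, _Node} = mnesia:subscribe(system),
        loop()
    end).

loop() ->
    receive
        {mnesia_system_event, {inconsistent_database, Context, Node}} ->
            %% Context says when the partition was detected, e.g.
            %% running_partitioned_network.
            error_logger:error_msg(
                "Mnesia partition (~p) detected with node ~p~n",
                [Context, Node]),
            %% Application-specific recovery would go here.
            loop();
        {mnesia_system_event, _Other} ->
            loop()
    end.
```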

best regards,
taavi