[erlang-questions] Maximum number of Mnesia nodes

Hakan Mattsson hakan@REDACTED
Mon Jul 30 17:25:35 CEST 2007


If you replicate the tables to all nodes, update
performance will get worse with each node that you add.
I think that the cost of such a transaction will
approximately follow this formula: C + N * P, where N is
the number of nodes, P is the work performed for each
(remote) transaction participant, and C is the (local)
transaction coordinator work that is independent of the
number of nodes.
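
To make the formula concrete, here is a minimal sketch in Python. The
function name and the values chosen for C and P are assumptions picked
only for illustration (they are not measurements), and N is taken to be
the number of nodes holding a replica.

```python
def update_cost_ms(n_nodes, coordinator_ms, participant_ms):
    """Approximate cost of one replicated update: C + N * P.

    n_nodes        -- N, the number of nodes holding a replica
    coordinator_ms -- C, local coordinator work, independent of N
    participant_ms -- P, work per transaction participant
    """
    if n_nodes < 1:
        raise ValueError("at least one node is required")
    return coordinator_ms + n_nodes * participant_ms


if __name__ == "__main__":
    # Hypothetical values: C = 2 ms, P = 8 ms.
    for n in (1, 2, 5, 20):
        print(n, update_cost_ms(n, 2.0, 8.0))
```

With these made-up values a single node costs 10 ms and 20 nodes cost
162 ms: the growth is linear in N, not logarithmic or exponential.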

But do you really need to replicate the data to all nodes?

Even if you have relatively few records in your database,
fragmented tables can be very useful in order to distribute
the load over many nodes.

If you can identify some type of record in your
application that is being accessed in the majority of
your transactions, it is a good candidate for fragmented
table storage. This could be a bank account, a subscriber,
a session, etc. You can also co-locate records in other
tables with your main record (see "foreign_key" in the
Mnesia chapter about fragmented tables).

When your application needs to access such a record, it
should determine one of the replica nodes for the table
and run the transaction on that node. If this can be
achieved for all transactions, there will always be a
fixed number of (2-3?) nodes involved in each transaction:
the (2?) nodes where the table is replicated plus
one of the nodes from which transactions are forwarded.
If you distribute the fragments over more nodes, it will
scale smoothly.
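
The access pattern above can be sketched as a small Python model. This
is not Mnesia's API: the hash function, the round-robin placement and
all names are assumptions made for illustration only. It shows that the
number of nodes touched per transaction stays bounded by the replica
count plus the forwarding node, however large the node pool grows.

```python
import zlib


def fragment_of(key, n_fragments):
    """Map a record key to a fragment number with a stable hash.

    Stand-in only: Mnesia uses its own hashing for fragmented tables.
    """
    return zlib.crc32(str(key).encode("utf-8")) % n_fragments


def replicas_of(fragment, pool, n_replicas):
    """Hypothetical round-robin placement of one fragment.

    Assumes n_replicas <= len(pool) so the replicas are distinct nodes.
    """
    return [pool[(fragment + i) % len(pool)] for i in range(n_replicas)]


def nodes_involved(key, origin, pool, n_fragments, n_replicas=2):
    """Nodes touched when `origin` forwards the transaction to a replica."""
    replicas = replicas_of(fragment_of(key, n_fragments), pool, n_replicas)
    return set(replicas) | {origin}


if __name__ == "__main__":
    for pool_size in (4, 16, 64):
        pool = ["node%d" % i for i in range(pool_size)]
        worst = max(
            len(nodes_involved("account-%d" % k, pool[0], pool, pool_size))
            for k in range(1000)
        )
        # worst never exceeds n_replicas + 1, regardless of pool_size
        print(pool_size, worst)
```

With full replication the same count would equal the pool size, which
is where the N * P term comes from.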

Of course, it is not possible to achieve this for all
types of applications. But you should strive for that
kind of access pattern in order to achieve better
scalability.

/Håkan

On Fri, 27 Jul 2007, denis wrote:

> Date: Fri, 27 Jul 2007 14:57:14 -0400
> From: denis <dloutrein.lists@REDACTED>
> To: 'Hakan Mattsson' <hakan@REDACTED>
> Cc: 'David King' <dking@REDACTED>, 'Joel Reymont' <joelr1@REDACTED>,
>     'Erlang Questions' <erlang-questions@REDACTED>
> Subject: RE: [erlang-questions] Maximum number of Mnesia nodes
> 
> Thanks Hakan for your response.
> 
> If I understand correctly, fragmented tables are useful when we have a high
> volume of data. That does not apply to me: I have around 100000 records in 5
> tables.
> I plan to have one mnesia instance per server (and one server per machine),
> each holding every table as ram_copies, replicated to the other servers.
> Each server uses its local mnesia instance (maybe that's not the best
> architecture?)
> 
> My concern is what happens when I do a delete or an insert into a table. The
> transaction succeeds only when the insert or delete is done on each
> replicated table. If I have only one server, the transaction time will be,
> for instance, 10ms. If I have two replicated servers, will it be 2*10ms? For
> N servers, what kind of factor can I expect? N, log(N), exp(N) ...?
> I'm not sure that I can run 20 servers, for instance, and keep good
> performance, depending on the response time of the transaction commit on
> each node.
> 
> Thanks
> Denis
> 
> 
> > -----Original Message-----
> > From: Hakan Mattsson [mailto:hakan@REDACTED]
> > Sent: Friday, 27 July 2007 13:09
> > To: denis
> > Cc: 'David King'; 'Joel Reymont'; 'Erlang Questions'
> > Subject: Re: [erlang-questions] Maximum number of Mnesia nodes
> > 
> > 
> > The scalability of Mnesia depends heavily on your
> > access patterns and how you have configured Mnesia.
> > 
> > If you ensure that the number of nodes involved in a
> > typical transaction is constant, Mnesia should scale
> > very well. One way of achieving linear scalability
> > characteristics, is to use the concept called
> > "foreign_key" in the chapter about fragmented
> > tables. The bench example (mnesia/examples/bench)
> > utilizes this technique.
> > 
> > When I wrote the "bench" benchmark example it turned
> > out to scale almost perfectly linearly. (By distributing
> > the Mnesia tables over twice as many computers, the
> > number of processed transactions per second also
> > doubled.) But at that time I only had access to 10 (or
> > was it 16?) identical computers, so I cannot say
> > anything about how Mnesia scales beyond that. It is worth
> > mentioning that I also successfully ran the bench
> > example with fragmented tables distributed over all our
> > machines at the office (50+). But as those computers
> > had such different characteristics, it is impossible to
> > say anything about the scalability. It was fun that it
> > worked though.
> > 
> > Chandru, do you still hold the high score for the number
> > of Mnesia nodes in a production environment?
> > 
> > /Håkan
> > 
> > On Fri, 27 Jul 2007, denis wrote:
> > 
> > > Date: Fri, 27 Jul 2007 11:15:18 -0400
> > > From: denis <dloutrein.lists@REDACTED>
> > > To: 'David King' <dking@REDACTED>, 'Joel Reymont' <joelr1@REDACTED>
> > > Cc: 'Erlang Questions' <erlang-questions@REDACTED>
> > > Subject: Re: [erlang-questions] Maximum number of Mnesia nodes
> > >
> > > Does nobody have a response to this yet?
> > >
> > > I'm designing a server that embeds mnesia with ram_copies
> > > tables. Several instances of the server can be launched, and the mnesia
> > > tables are replicated.
> > > I have the same concern: how many instances can I run before
> > > transaction commits in mnesia become a problem?
> > >
> > > If someone has already used several replicated mnesia instances, I would
> > > like to have some numbers.
> > >
> > > Thanks
> > > Denis
> > >
> > > > -----Original Message-----
> > > > From: erlang-questions-bounces@REDACTED [mailto:erlang-questions-bounces@REDACTED] On Behalf Of David King
> > > > Sent: Monday, 23 July 2007 21:24
> > > > To: Joel Reymont
> > > > Cc: Erlang Questions
> > > > Subject: Re: [erlang-questions] Maximum number of Mnesia nodes
> > > >
> > > > Did you ever get any off-list responses to this? I'm curious too.
> > > >
> > > > On 15 Jul 2007, at 05:44, Joel Reymont wrote:
> > > >
> > > > > Folks,
> > > > >
> > > > > How many Mnesia nodes are you running in your production
> > > > > installation? I'm looking to find the maximum here.
> > > > >
> > > > > I'm only dealing with ram_copies tables (cache), trying to figure
> > > > > out whether I can make every Yaws node a Mnesia node without
> > > > > slowing transactions down too much.
> > > > >
> > > > > My current thinking is to wait and gather statistics before trying
> > > > > to decouple Yaws and Mnesia. Still, I would love to know how much
> > > > > transactions slow down with the addition of every new Mnesia node.
> > > > >
> > > > > 	Thanks, Joel
> > > > >
> > > > > --
> > > > > http://topdog.cc      - EasyLanguage to C# compiler
> > > > > http://wagerlabs.com  - Blog

