[erlang-questions] understanding the scalability limits of erlang and mnesia
Brian Acton
acton@REDACTED
Tue Jan 26 21:19:10 CET 2010
I have a mix of ram_copies, disc_copies, and disc_only_copies tables.
My hardware config is mixed: one box with a single drive (to be end-of-lifed)
and 3 boxes with RAID 5 to help with overall disk I/O throughput.
My disc_only tables:
do1 : with 777805 records occupying 111980141 bytes on disc
do2 : with 594324 records occupying 639543647 bytes on disc
do3 : with 1837761 records occupying 512674458 bytes on disc
My disc tables:
d1 : with 1112655 records occupying 73119677 words of mem
d2 : with 1117441 records occupying 143819464 words of mem
My ram tables:
r1 : with 3493 records occupying 791941 words of mem
r2 : with 10482 records occupying 1976194 words of mem
r3 : with 14160 records occupying 520918 words of mem
r4 : with 3759 records occupying 79983 words of mem
Overall, it looks like about 1-2GB of data that would need to be replicated
/ transferred during startup. Is that correct?
Can you explain more about how index uniqueness affects recovery / startup
times?
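
As a rough check of that estimate, here is a minimal Python sketch that adds
up the table sizes listed above. The word size is an assumption (4 bytes on a
32-bit emulator, 8 bytes on a 64-bit one), the 80 MB/s read rate is the
single-drive figure from the reply quoted below, and the helper names are made
up for illustration. In-memory size is also only a proxy for what actually
gets loaded or copied at startup.

```python
# Table sizes copied from the lists above.
DISC_ONLY_BYTES = {"do1": 111980141, "do2": 639543647, "do3": 512674458}
DISC_WORDS = {"d1": 73119677, "d2": 143819464}
RAM_WORDS = {"r1": 791941, "r2": 1976194, "r3": 520918, "r4": 79983}


def total_bytes(word_size):
    """Total bytes across all tables for an assumed word size (4 or 8)."""
    words = sum(DISC_WORDS.values()) + sum(RAM_WORDS.values())
    return sum(DISC_ONLY_BYTES.values()) + words * word_size


def read_seconds(nbytes, bytes_per_sec=80_000_000):
    """Time to read nbytes sequentially at an assumed 80 MB/s."""
    return nbytes / bytes_per_sec


for word_size in (4, 8):
    n = total_bytes(word_size)
    print(f"{word_size}-byte words: {n / 1e9:.2f} GB, "
          f"about {read_seconds(n):.0f} s at 80 MB/s")
```

With these figures the total comes to roughly 2.1 GB with 4-byte words and
3.0 GB with 8-byte words, so somewhat above 1-2GB if every table has to be
loaded or copied. Either way, a sequential read at that rate takes well under
a minute.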
Thx,
--b
On Tue, Jan 26, 2010 at 11:46 AM, Paul Fisher <pfisher@REDACTED> wrote:
> First of all, you need to specify the type of mnesia tables you are
> using. I am going to assume you are using disc_copies, since
> disc_only_copies and ram_copies should not act the way you describe.
>
> Second, with disc_copies the tables should recover at the speed the file
> can be read from disk. Typically this is 80 MB/s or more for even a single
> SATA drive, so even large tables should be fast. While the size of the
> dataset does affect the table start time, it is more likely that you are
> seeing a problem with the uniqueness of either the index of the table,
> or of a secondary index. If you have a good unique index for the table,
> but also have a secondary index with a limited number of values, the
> table will recover as you describe.
>
> On Tue, 2010-01-26 at 13:04 -0600, Brian Acton wrote:
> > Hi Guys,
> >
> > I'm fairly new to erlang and I'm trying to understand better how erlang and
> > mnesia deal with large scale. I'm wondering if anyone could provide some
> > examples where they have been using erlang in a very large configuration
> > (i.e. more than 10 machines / more than 100 machines). I specifically am
> > interested where people are running in a clustered configuration with an
> > mnesia backing store to their application.
> >
> > It's been my experience that as much as a technology claims to be scalable,
> > operability issues usually surface that make it bad in practice to simply
> > just add more machines to the cluster. As an example, in my current
> > configuration, I am experiencing a 10 minute mnesia recovery / verification
> > time during node startup. If I try to bring up two nodes at the same time, I
> > see even longer times and sometimes even failure during bring up. And my
> > cluster is only four nodes in size. Of course, when the system is at steady
> > state (i.e. all nodes up and running), it's awesome. However, when I have to
> > go through a crash / recovery cycle, I usually want to shoot myself....
> >
> > Anyone got any war stories to share? Any papers or presentations that I
> > should look at?
> >
> > Thanks muchly,
> >
> > --b
>
>
>