mnesia replication (Are there checksums?)
Francesco Cesarini (Erlang Training & Consulting)
Thu Sep 1 08:36:53 CEST 2005
I am amazed we never came across this bug (ok, feature :-) ) before. I
would have expected an alarm to be generated as soon as the databases
became inconsistent. I guess a way to work around the problem is to hash
the dirty writes across the nodes based on the key.
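
A minimal sketch of the key-hashing idea, written in Python as a language-neutral illustration rather than as Mnesia code. It assumes every node knows the same list of node names; the node names and the function owner_node are hypothetical. Each key maps to exactly one node, so all dirty writes for that key could be sent to, and serialized by, that node. It does not address what should happen while the owner node is unreachable.

```python
import hashlib

def owner_node(key, nodes):
    """Deterministically pick the node responsible for writes to `key`.

    Sorting the node list makes the choice independent of the order in
    which each caller happens to hold the list (assumes all callers know
    the same set of nodes).
    """
    ordered = sorted(nodes)
    digest = hashlib.sha256(repr(key).encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(ordered)
    return ordered[index]

# Hypothetical node names, for illustration only.
NODES = ["a@host1", "b@host2", "c@host3"]

def route_write(key, value, nodes=NODES):
    """Return (node, record): which node should perform the write."""
    return owner_node(key, nodes), (key, value)
```

In Erlang, erlang:phash2/2 could play the role of the hash function here.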
How hard would it be to add a checksum to each table? It should not
generate any major overheads... The subject had been discussed, but
probably before you took over the reins.
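
One way such a per-table checksum might look, again as a Python sketch and not anything Mnesia provides (the reply quoted below confirms there are no table checksums). The function names are hypothetical. The checksum is the XOR of per-record digests, so it does not depend on record order and can be updated on each write without rescanning the table. It assumes records in a table are unique, as in a set table with one record per key, because XOR cancels out exact duplicates.

```python
import hashlib

def record_digest(record):
    """Digest of one record as an integer (record must have a stable repr)."""
    return int.from_bytes(hashlib.sha256(repr(record).encode("utf-8")).digest(), "big")

def table_checksum(records):
    """Order-independent checksum: XOR of all record digests."""
    acc = 0
    for record in records:
        acc ^= record_digest(record)
    return acc

def update_checksum(checksum, old_record, new_record):
    """Incrementally adjust a checksum when old_record is replaced by new_record.

    Pass None for old_record on insert, or for new_record on delete.
    """
    if old_record is not None:
        checksum ^= record_digest(old_record)
    if new_record is not None:
        checksum ^= record_digest(new_record)
    return checksum
```

Two replicas whose checksums differ hold different contents, which is the point at which an alarm could be raised.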
all my best,
Hakan Mattsson wrote:
> No, there are no table checksums. Mnesia relies on
> other recovery mechanisms.
> The behaviour that you call a serious bug, is
> deliberate. Normally all database accesses should be
> performed within transactions. If the performance is
> good enough you should not use dirty access at
> all. The only reason for using dirty access is to
> gain better performance. But that does not come for
> free, as you need to deal with almost all
> concurrency issues yourself. One of these issues is
> serialization of updates. If this is unexpected, the
> documentation should be blamed (or possibly the
> reader of the documentation ;-).
> On Wed, 31 Aug 2005, Francesco Cesarini (Erlang Training & Consulting) wrote:
> FC> I would class this as a serious bug! I have a vague recollection that
> FC> there was a checksum being computed for every table, but have looked
> FC> everywhere and cannot find any reference to it. Maybe it was just a
> FC> discussion I had with someone 10 years ago, or something... Did it ever
> FC> happen?
> FC> Francesco
> FC> --
> FC> http://www.erlang-consulting.com
> FC> Dan Gudmundsson wrote:
> FC> > chandru writes:
> FC> > > Hi,
> FC> > > On 30/08/05, Serge Aleynikov <serge@REDACTED> wrote:
> FC> > > > Hello,
> FC> > > > Could someone comment on the effect of short network outages
> FC> > > > (< 10-15 s) on mnesia replication and how to prevent the
> FC> > > > inconsistency demonstrated in the example below? I intentionally
> FC> > > > did not alter the net_ticktime kernel parameter so that it would
> FC> > > > be greater than the duration of the brief network outage.
> FC> > > You can't really prevent this inconsistency if you are using dirty
> FC> > > operations. Have you tried the same test using transactions instead
> FC> > > of dirty operations?
> FC> >
> FC> > Since dirty operations don't grab a lock, you should be able to see
> FC> > the same problem with a working network.
> FC> >
> FC> > Dirty is dirty, be aware of that.
> FC> >
> FC> > /Dan