Mnesia, disk logging, and synchronous disk logging

Dan Gudmundsson dgud@REDACTED
Thu Jan 26 08:29:58 CET 2006


Talking about poor man's solutions, you can also use mnesia:dump_log(),
which closes the files after the operation.

Log dumping is otherwise automatic, and you can control it by elapsed
time or by number of writes to the transaction log (see
dump_log_time_threshold and dump_log_write_threshold in the manual).
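
As a rough sketch of tuning the automatic dump, the two Mnesia
application parameters can be passed on the erl command line when the
node starts. The values below (60 seconds, 500 log writes) are made up
for illustration, not recommendations, and the script only prints the
command so it can be read without Erlang installed.

```shell
#!/bin/sh
# Hypothetical thresholds: dump the transaction log every 60 s or after
# every 500 writes to the log, whichever comes first.
DUMP_TIME_MS=60000
DUMP_WRITES=500

# "-mnesia Key Value" sets a Mnesia application parameter at node start.
ERL_CMD="erl -mnesia dump_log_time_threshold $DUMP_TIME_MS -mnesia dump_log_write_threshold $DUMP_WRITES"

# Print the command instead of running it.
echo "$ERL_CMD"
```

Lower thresholds mean more frequent dumps, which costs disk I/O.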

A per-transaction disk sync option would require some hacking, though.
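
To make concrete what "sync per write" means at the OS level, here is a
minimal sketch outside Mnesia. It assumes GNU dd (oflag=append and
conv=fsync are GNU options), and the log file and the append_synced
helper are hypothetical names. It is not how Mnesia or disk_log work
today; it only shows write-then-fsync for each record.

```shell
#!/bin/sh
# Hypothetical log file standing in for a transaction log.
LOG=$(mktemp)

# Append one record, then have dd call fsync on the output file before it
# exits, so the data is handed to the disk device and not just the OS cache.
append_synced() {
    printf '%s\n' "$1" |
        dd of="$LOG" oflag=append conv=notrunc,fsync 2>/dev/null
}

append_synced "commit 1"
append_synced "commit 2"
```

Every call pays for a full fsync, which is why doing this per
transaction is expensive.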

/Dan

Scott Lystig Fritchie writes:
 > >>>>> "hm" == Hakan Mattsson <hakan@REDACTED> writes:
 > 
 > hm> In Mnesia the coordinator always waits synchronously for 2PC
 > hm> (and 3PC) votes from all participants, regardless of whether the
 > hm> transaction is "synchronous" or not.
 > 
 > That makes sense ... the coordinator can do Very Bad Things if it
 > doesn't gather all votes.
 > 
 > hm> I agree that such a feature can be useful.  At least if there
 > hm> are no write caches enabled in the disk hardware. Otherwise you
 > hm> could lose some data anyway in case of a power failure.
 > 
 > Even if your disk subsystem(*) has an NVRAM write-back cache, there is
 > risk of data loss unless you explicitly use the fsync(2) system call.
 > 
 > With Mnesia using the disk_log module, which in turn usually uses
 > write(2) only, you are not certain that the OS will have copied
 > write(2)'s data to the disk device.  In most cases, the kernel can
 > (and will) wait for many seconds before flushing that data to the disk
 > device.
 > 
 > SLF> But I can't find a Mnesia transaction knob/button that I can
 > SLF> twist/press to request that level of safety.  Is there such a
 > SLF> thing?
 > 
 > hm> No, currently there is no such thing in Mnesia.
 > 
 > That's what I'd thought.
 > 
 > Assuming that I wanted to try to add that to Mnesia ... I think I'd
 > need to add extra info to the commit record that's sent to each
 > participant.  Something that said: this log record is important enough
 > to use fsync after writing.  Hm.
 > 
 > I suppose a poor man's safety net would be to run a shell script like
 > this on each Mnesia node with disc_copies or disc_only_copies:
 > 
 >     while true; do
 >         sync       # ask the kernel to write cached file data to disk
 >         sleep 1    # wait one second before the next flush
 >     done
 > 
 > Easy to do, doesn't require code changes, and would limit worst-case
 > data loss to roughly 1-2 seconds.  (Assuming that disk_log and the
 > file Port that disk_log uses do not do any buffering.)  On the other
 > hand, performance may suck.
 > 
 > Too bad disk drives are so darn slow.
 > 
 > -Scott
 > 
 > (*) Even if the logical disk device is an NVRAM/solid-state disk drive.

-- 
Dan Gudmundsson               Project:    Mnesia, Erlang/OTP
Ericsson Utvecklings AB       Phone:      +46  8 727 5762 
UAB/F/P                       Mobile:     +46 70 519 9469
S-125 25 Stockholm            Visit addr: Armborstv 1 
