Emulator stopping during mnesia writes
Scott Lystig Fritchie
scott@REDACTED
Wed May 10 17:46:56 CEST 2000
>>>>> "sh" == Sean Hinde <Sean.Hinde@REDACTED> writes:
sh> Working under my commercial support agreement with the guys in
sh> Sweden we have established that as the emulator is single threaded
sh> with regard to disk operations, a disk operation which is slow
sh> will stop the emulator.
The output of "sar" or "iostat" would help confirm this. If, say, a
1-minute snapshot of activity from "sar -d 5 12" shows that the
disk/volume that the Mnesia log is written to is busier than 70%
("%busy"), the average I/O queue length is above a handful
("avque"), and/or the average service time is above 20 ms ("avserv"),
you've got a disk/volume that's too busy. (I'm pulling those numbers
out of my memory ... I highly recommend Adrian Cockcroft's book on
Solaris performance tuning (2nd edition), if you don't already
have it.)
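
To make those rules of thumb concrete, here is a minimal Python sketch
that scans "sar -d" style text and flags devices over the thresholds
above. It is only an illustration under several assumptions: the
Solaris-style column order (device, %busy, avque, r+w/s, blks/s,
avwait, avserv), "a handful" taken to mean 5, and made-up sample
numbers. The function names are hypothetical, not part of any tool.

```python
def parse_sar_d(text):
    """Yield (device, busy_pct, avque, avserv_ms) from sar -d style text.

    Assumes each data line ends with: device %busy avque r+w/s blks/s
    avwait avserv, optionally preceded by a timestamp.
    """
    for line in text.splitlines():
        tokens = line.split()
        if len(tokens) < 7 or tokens[0] == "Average":
            continue  # too short, or a summary line we do not want twice
        device, fields = tokens[-7], tokens[-6:]
        try:
            busy, avque, _rw, _blks, _avwait, avserv = map(float, fields)
        except ValueError:
            continue  # header or other non-numeric line
        yield device, busy, avque, avserv


def too_busy(text, busy_pct=70.0, queue_len=5.0, serv_ms=20.0):
    """Map each device to the sorted list of thresholds it exceeded."""
    flagged = {}
    for device, busy, avque, avserv in parse_sar_d(text):
        reasons = set()
        if busy > busy_pct:
            reasons.add("%busy")
        if avque > queue_len:
            reasons.add("avque")
        if avserv > serv_ms:
            reasons.add("avserv")
        if reasons:
            flagged.setdefault(device, set()).update(reasons)
    return {dev: sorted(why) for dev, why in flagged.items()}


# Made-up sample, for illustration only.
SAMPLE = """\
17:46:56   device  %busy  avque  r+w/s  blks/s  avwait  avserv
17:47:01   sd0        85    6.2    120    1900    12.0    25.3
           sd1        10    0.1      5      80     0.0     4.0
"""

if __name__ == "__main__":
    for dev, why in too_busy(SAMPLE).items():
        print(dev, "exceeds:", ", ".join(why))
```

Since the thresholds are joined by "and/or", the sketch flags a device
when any one of them is exceeded in any sample.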
Over-busy disk spindles are the bane of Usenet News, email, and
database servers everywhere. In my INN hacking days, I watched (using
"truss") hopelessly overloaded INN servers take over a second
just to perform a single open("/var/spool/news/alt/whatever/989832",
O_CREAT|O_EXCL|O_WRONLY) system call. Problems with too many files in
a single directory on an FFS-based file system (as Sun's UFS is) were
only exacerbated by having the disks in the file system (using the
Solstice volume manager) at over 95% busy on average.
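
If "truss" isn't handy, a rough way to see the same effect is to time
an exclusive-create open() yourself. The Python sketch below is an
assumption-laden illustration, not what I ran back then: the helper
name and the probe file name are hypothetical, and it measures
wall-clock time for one call, which is noisy on a shared machine.

```python
import os
import tempfile
import time


def time_exclusive_create(directory, name="probe.tmp"):
    """Return seconds spent in one O_CREAT|O_EXCL|O_WRONLY open().

    The probe file is removed afterwards. Raises FileExistsError if a
    file with that name is already present in the directory.
    """
    path = os.path.join(directory, name)
    start = time.perf_counter()
    fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
    elapsed = time.perf_counter() - start
    os.close(fd)
    os.unlink(path)
    return elapsed


if __name__ == "__main__":
    # A temporary directory stands in for the spool or log directory.
    with tempfile.TemporaryDirectory() as d:
        print("open() took %.6f s" % time_exclusive_create(d))
```

Point it at a directory on the volume you suspect; on a healthy disk
the call should normally return in well under a millisecond or so.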
If indeed the disk or disks storing the file system with the mnesia
logs are too busy, the solutions are few: software-based striping
across multiple disk drives, hardware-based striping across multiple
disk drives, solid state disk drives, or algorithmic changes to reduce
I/O workload. The last may be more difficult to do than the first
three. :-)
-Scott