[erlang-questions] Rare emulator crash
Martin Carlson
martin@REDACTED
Fri Feb 22 13:31:26 CET 2008
Hi all,
We are running a set of distributed nodes on R11B and openSUSE 10.2.
We have recently observed an emulator crash that leaves neither an
erl_crash.dump nor a core file.
However, the emulator terminates with the following message:
*** glibc detected *** /home/erix/erts-5.5.3/bin/beam.smp:
double free or corruption (out): 0xb45018b8 ***
======= Backtrace: =========
/lib/libc.so.6[0xb7dce6e1]
/lib/libc.so.6(cfree+0x89)[0xb7dcfd79]
/home/erix/erts-5.5.3/bin/beam.smp(driver_free+0x21)[0x80b14a1]
/home/erix/erts-5.5.3/bin/beam.smp[0x80b7db9]
======= Memory map: ========
08048000-0817e000 r-xp 00000000 08:07 5543376
/home/erix/erts-5.5.3/bin/beam.smp
0817e000-0817f000 r-xp 00135000 08:07 5543376
/home/erix/erts-5.5.3/bin/beam.smp
0817f000-081a7000 rwxp 00136000 08:07 5543376
/home/erix/erts-5.5.3/bin/beam.smp
081a7000-081f9000 rwxp 081a7000 00:00 0 [heap]
b3a16000-b3e4b000 rwxp b3a16000 00:00 0
b3ecb000-b40cc000 rwxp b3ecb000 00:00 0
b41cc000-b42e7000 r-xp 00000000 08:06 2354496
/usr/lib/libcrypto.so.0.9.8
b42e7000-b42ed000 r-xp 0011b000 08:06 2354496
/usr/lib/libcrypto.so.0.9.8
b42ed000-b42fc000 rwxp 00121000 08:06 2354496
/usr/lib/libcrypto.so.0.9.8
b42fc000-b4508000 rwxp b42fc000 00:00 0
b4508000-b4600000 ---p b4508000 00:00 0
b46ae000-b46b8000 r-xp 00000000 08:06 870104 /lib/libgcc_s.so.1
b46b8000-b46ba000 rwxp 00009000 08:06 870104 /lib/libgcc_s.so.1
b46ba000-b46cb000 r-xp 00000000 08:06 870117 /lib/libz.so.1.2.3
b46cb000-b46cd000 rwxp 00010000 08:06 870117 /lib/libz.so.1.2.3
b46d7000-b46de000 r-xs 00000000 08:06 2380765
/usr/lib/gconv/gconv-modules.cache
b46de000-b4719000 r-xp 00000000 08:06 2397784
/usr/lib/locale/en_US.utf8/LC_CTYPE
b4719000-b4a1c000 rwxp b4719000 00:00 0
b4a1c000-b4a1d000 ---p b4a1c000 00:00 0
b4a1d000-b521d000 rwxp b4a1d000 00:00 0
b521d000-b521e000 ---p b521d000 00:00 0
b521e000-b5a1e000 rwxp b521e000 00:00 0
b5a1e000-b5a1f000 ---p b5a1e000 00:00 0
b5a1f000-b621f000 rwxp b5a1f000 00:00 0
b621f000-b6220000 ---p b621f000 00:00 0
b6220000-b6a20000 rwxp b6220000 00:00 0
b6a20000-b6a21000 ---p b6a20000 00:00 0
b6a21000-b7221000 rwxp b6a21000 00:00 0
b7221000-b7222000 ---p b7221000 00:00 0
b7222000-b7d6a000 rwxp b7222000 00:00 0
b7d6a000-b7e92000 r-xp 00000000 08:06 870060 /lib/libc-2.5.so
b7e92000-b7e93000 r-xp 00128000 08:06 870060 /lib/libc-2.5.so
b7e93000-b7e95000 rwxp 00129000 08:06 870060 /lib/libc-2.5.so
b7e95000-b7e98000 rwxp b7e95000 00:00 0
b7e98000-b7e9f000 r-xp 00000000 08:06 870090 /lib/librt-2.5.so
b7e9f000-b7ea1000 rwxp 00006000 08:06 870090 /lib/librt-2.5.so
b7ea1000-b7edc000 r-xp 00000000 08:06 870112
/lib/libncurses.so.5.5
b7edc000-b7ee3000 r-xp 0003a000 08:06 870112
/lib/libncurses.so.5.5
b7ee3000-b7ee8000 rwxp 00041000 08:06 870112
/lib/libncurses.so.5.5
b7ee8000-b7ee9000 rwxp b7ee8000 00:00 0
b7ee9000-b7efd000 r-xp 00000000 08:06 870086
/lib/libpthread-2.5.so
b7efd000-b7eff000 rwxp 00013000 08:06 870086
/lib/libpthread-2.5.so
b7eff000-b7f01000 rwxp b7eff000 00:00 0
b7f01000-b7f25000 r-xp 00000000 08:06 870068 /lib/libm-2.5.so
b7f25000-b7f27000 rwxp 00023000 08:06 870068 /lib/libm-2.5.so
b7f27000-b7f29000 r-xp 00000000 08:06 870066 /lib/libdl-2.5.so
b7f29000-b7f2b000 rwxp 00001000 08:06 870066 /lib/libdl-2.5.so
b7f2b000-b7f2d000 r-xp 00000000 08:06 870094 /lib/libutil-2.5.so
b7f2d000-b7f2f000 rwxp 00001000 08:06 870094 /lib/libutil-2.5.so
b7f34000-b7f37000 r-xp 00000000 08:07 5544113
/home/erix/lib/crypto-1.5/priv/lib/crypto_drv.so
b7f37000-b7f38000 r-xp 00002000 08:07 5544113
/home/erix/lib/crypto-1.5/priv/lib/crypto_drv.so
b7f38000-b7f39000 rwxp 00003000 08:07 5544113
/home/erix/lib/crypto-1.5/priv/lib/crypto_drv.so
b7f39000-b7f3a000 rwxp b7f39000 00:00 0
b7f3a000-b7f3b000 r-xp b7f3a000 00:00 0 [vdso]
b7f3b000-b7f56000 r-xp 00000000 08:06 870053 /lib/ld-2.5.so
b7f56000-b7f58000 rwxp 0001a000 08:06 870053 /lib/ld-2.5.so
bf9d7000-bf9ec000 rwxp bf9d7000 00:00 0 [stack]
heart: Thu Feb 21 13:18:34 2008: Erlang has closed.
The node is then restarted by heart.
The system is fairly heavy on TCP/IP traffic and file IO. The crash
manifests itself when network interfaces are shut down on the switches.
This leads me to suspect the inets driver; unfortunately, I have no
backtrace that would narrow down the problem at this point.
As the glibc debug printout shows, free seems to be called on an
invalid pointer.
Has anybody seen this before while running with standard OTP drivers
(crypto), SMP (4 proc), kernel poll and IO threads?
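
Without a core, one thing that can still be read off the printout is which mapping each reported address belongs to. The following is only a minimal sketch of that lookup, not a diagnosis: the helper names (parse_maps, find_region) are made up for illustration, and MAPS holds just a handful of lines copied from the memory map above, with each wrapped path joined back onto its line.

```python
import re

# A few lines copied from the memory map in this post (wrapped paths rejoined).
MAPS = """\
08048000-0817e000 r-xp 00000000 08:07 5543376 /home/erix/erts-5.5.3/bin/beam.smp
081a7000-081f9000 rwxp 081a7000 00:00 0 [heap]
b42fc000-b4508000 rwxp b42fc000 00:00 0
b4508000-b4600000 ---p b4508000 00:00 0
b7d6a000-b7e92000 r-xp 00000000 08:06 870060 /lib/libc-2.5.so
"""

# Fields: start-end, permissions, offset, device, inode, optional name.
LINE = re.compile(r"^([0-9a-f]+)-([0-9a-f]+)\s+(\S+)\s+\S+\s+\S+\s+\S+\s*(.*)$")


def parse_maps(text):
    """Hypothetical helper: turn map lines into (start, end, perms, name)."""
    regions = []
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if m:
            start, end = int(m.group(1), 16), int(m.group(2), 16)
            regions.append((start, end, m.group(3), m.group(4) or "(anonymous)"))
    return regions


def find_region(regions, addr):
    """Hypothetical helper: return (perms, name) of the region holding addr."""
    for start, end, perms, name in regions:
        if start <= addr < end:
            return perms, name
    return None


regions = parse_maps(MAPS)
# The freed pointer, the driver_free frame and the cfree frame from the printout.
for addr in (0xB45018B8, 0x80B14A1, 0xB7DCFD79):
    print(hex(addr), find_region(regions, addr))
```

With these lines, the pointer glibc complains about (0xb45018b8) lands in the anonymous rwxp mapping b42fc000-b4508000, while the driver_free and cfree frames land in beam.smp and libc as expected. That only locates the addresses; it does not say which driver passed the bad pointer.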
//Martin Carlson
Erlang Training & Consulting
http://www.erlang-consulting.com