[erlang-questions] Strange crash and hang
David Welton
davidnwelton@REDACTED
Fri Sep 19 10:59:42 CEST 2014
Hi,
We're getting an odd crash that I suspect is related to one we had in
the past, where we were receiving a bad term from a C node. Since the
term being sent is pretty large (20+ megabytes), when it gets caught
up in the error logger and converted from a large binary into a list
holding the string representation of that binary, it kills Erlang.
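To illustrate the size problem: on a 64-bit emulator each list cell
takes two words (16 bytes), and a binary printed as <<1,2,3,...>> can
need several characters per byte, so a 20+ megabyte binary can expand
to well over a gigabyte once it becomes a character list. A minimal
sketch of one way to avoid this in our own logging calls is
depth-limited formatting with the "~P" control sequence of io_lib.
The module name safe_log, the function summarize, and the depth of 20
are made up for the example, and this does not change how OTP itself
formats crash reports.

```erlang
-module(safe_log).
-export([summarize/1, summarize/2]).

%% Hypothetical helper, not part of OTP.
%% The default depth is an arbitrary illustrative value.
-define(DEFAULT_DEPTH, 20).

%% Format a term with the default depth limit.
summarize(Term) ->
    summarize(Term, ?DEFAULT_DEPTH).

%% "~P" takes the term plus a maximum depth; anything beyond that
%% depth is printed as "...", so the resulting string stays short
%% no matter how large the term (or binary) is.
summarize(Term, Depth) when is_integer(Depth), Depth > 0 ->
    lists:flatten(io_lib:format("~P", [Term, Depth])).
```

It could be used along the lines of
error_logger:error_msg("bad term from C node: ~s~n",
[safe_log:summarize(Term)]), so that only the truncated string is
handed to the error logger.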
What's particularly odd is that, many times, it doesn't finish
writing the crash dump; it just sits there:
Gdb:
#0 0x00007f0a0c020e93 in select () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00000000005495e0 in erts_sys_main_thread ()
#2 0x00000000004561d0 in erl_start ()
#3 0x00000000004367d9 in main ()
strace -f -p 9205
Process 9205 attached with 19 threads - interrupt to quit
[pid 9248] futex(0x7f0a0b0c0550, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid 9247] futex(0x7f0a0b0c0510, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid 9246] futex(0x7f0a0b0c04d0, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid 9245] futex(0x7f0a0b0c0490, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid 9244] futex(0x7f0a0b0c0450, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid 9243] wait4(-1, <unfinished ...>
[pid 9241] futex(0x7f0a0b0c0390, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid 9242] futex(0x7f0a0b0c03d0, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid 9240] futex(0x7f0a0b0c0350, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid 9239] futex(0x7f0a0b0c0310, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid 9238] futex(0x7f0a0b0c02d0, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid 9236] futex(0x7f0a0b0c0250, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid 9237] futex(0x7f0a0b0c0290, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid 9235] futex(0x7f0a0b0c0210, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid 9234] futex(0x7f0a0b0c01d0, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid 9233] futex(0x7f0a0b0c0190, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid 9232] futex(0x888484, FUTEX_WAIT_PRIVATE, 3, NULL <unfinished ...>
[pid 9231] read(6, <unfinished ...>
[pid 9205] select(0, NULL, NULL, NULL, NULL^C <unfinished ...>
It's not writing to the crash dump file or doing anything else,
really; it's just stuck in that select.
Any ideas what may be happening? Here's the beginning of the crash dump:
Slogan: eheap_alloc: Cannot allocate 3049303816 bytes of memory (of type "heap_frag").
System version: Erlang/OTP 17 [erts-6.1] [source] [64-bit] [smp:4:4] [async-threads:10] [hipe] [kernel-poll:false]
Compiled: Mon Jul 7 15:00:04 2014
Taints: crypto
Thoughts or ideas? I'm going to try dropping in a recompiled beam.smp
to see if I can print a simple debug message when things crash on a
bad term.
Thanks
--
David N. Welton
http://www.welton.it/davidw/
http://www.dedasys.com/