R11B-0 SMP segfaults in unlink_free_block
Richard Cameron
camster@REDACTED
Sun Jul 30 12:26:26 CEST 2006
Hi,
I'm seeing fairly regular segmentation faults on R11B-0 on a 2-CPU
64-bit Linux box. It's compiled from source with no special options
(other than ./configure --prefix=/opt/erlang). I got beam.smp to dump
core, and I've attached a stack trace from gdb below.

erts would have been compiled with gcc -O3, so there's a vague
possibility that the stack trace is slightly bogus. However, it seems
to go wrong only in SMP mode, and the offending section is called
from somewhere in time.c with the rather frightening-looking comment:
    /* Here comes hairy use of the timer fields!
     * They are reset without having the lock.
     * It is assumed that no code but this will
     * accesses any field until the ->timeout
     * callback is called.
     */
    p->next = NULL;
    p->slot = 0;
    (*p->timeout)(p->arg);
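The pattern that comment describes can be sketched roughly as follows. This is a simplified, hypothetical illustration and not the actual time.c code: the names demo_timer, wheel_lock, set_timer and fire_one are invented here, and the single linked list standing in for the timer wheel is an assumption made to keep the example short. It only shows where the unlocked window sits: after the timer has been taken off the list under the lock, its fields are reset and the callback is invoked with no lock held.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical timer record using the field names from the snippet above. */
typedef struct demo_timer {
    struct demo_timer *next;
    int slot;
    void (*timeout)(void *);
    void *arg;
} demo_timer;

static pthread_mutex_t wheel_lock = PTHREAD_MUTEX_INITIALIZER;
static demo_timer *wheel_head = NULL;   /* one list stands in for the wheel */

/* Arm a timer: every field is written while holding the lock. */
static void set_timer(demo_timer *p, void (*cb)(void *), void *arg)
{
    pthread_mutex_lock(&wheel_lock);
    p->timeout = cb;
    p->arg = arg;
    p->slot = 1;
    p->next = wheel_head;
    wheel_head = p;
    pthread_mutex_unlock(&wheel_lock);
}

/* Pop one timer and fire it. Returns 1 if a timer fired, 0 if none queued. */
static int fire_one(void)
{
    pthread_mutex_lock(&wheel_lock);
    demo_timer *p = wheel_head;
    if (p != NULL)
        wheel_head = p->next;
    pthread_mutex_unlock(&wheel_lock);

    if (p == NULL)
        return 0;

    /* Unlocked window: these writes and the callback are only safe if no
     * other thread touches *p between the unlock above and the callback. */
    p->next = NULL;
    p->slot = 0;
    (*p->timeout)(p->arg);
    return 1;
}

static void count_cb(void *arg) { ++*(int *)arg; }

/* Single-threaded usage example; returns 1 if everything behaved as expected. */
static int demo_fire(void)
{
    int fired = 0;
    demo_timer t;
    set_timer(&t, count_cb, &fired);
    int r1 = fire_one();
    int r2 = fire_one();
    return r1 == 1 && r2 == 0 && fired == 1 && t.next == NULL && t.slot == 0;
}
```

Run single-threaded, as in demo_fire, this is harmless. Whether a second thread can actually reach the same record inside that window in R11B-0 is exactly the open question, and this sketch does not answer it.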

The application probably has several thousand Erlang processes
spawned, most of which are performing this sort of hybrid
poll/receive pattern:
    loop(Timeout) ->
        receive
            event ->
                handle_event()
        after Timeout ->
            poll_external_system()
        end,
        loop(Timeout).
Is it possible my code's picking out an obscure race condition in the
new SMP code?
---
(smithers)lisa:~% gdb /opt/erlang/lib/erlang/erts-5.5/bin/beam.smp core.12735
GNU gdb Red Hat Linux (6.3.0.0-1.96rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...Using host libthread_db library "/lib64/tls/libthread_db.so.1".
Core was generated by `/opt/erlang/lib/erlang/erts-5.5/bin/beam.smp -- -root /opt/erlang/lib/erlang -p'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /lib64/libdl.so.2...done.
[...]
Loaded symbols for /usr/lib64/libz.so.1
#0  0x000000000043729e in unlink_free_block (allctr=0x677a00, block=0x6e4a38)
    at beam/erl_goodfit_alloc.c:452
452         blk->prev->next = blk->next;
(gdb) bt
#0  0x000000000043729e in unlink_free_block (allctr=0x677a00, block=0x6e4a38)
    at beam/erl_goodfit_alloc.c:452
#1  0x000000000043305b in mbc_free (allctr=0x677a00, p=Variable "p" is not available.
)
    at beam/erl_alloc_util.c:731
#2  0x0000000000436555 in erts_alcu_free_ts (type=Variable "type" is not available.
)
    at beam/erl_alloc_util.c:2221
#3  0x000000000047cafe in timer_thread_start (ignore=Variable "ignore" is not available.
) at beam/time.c:292
#4  0x000000000050bea8 in thr_wrapper (vtwd=Variable "vtwd" is not available.
) at common/ethread.c:440
#5  0x00000039a100610a in start_thread () from /lib64/tls/libpthread.so.0
#6  0x000000399fdc5ee3 in clone () from /lib64/tls/libc.so.6
#7  0x0000000000000000 in ?? ()
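
If the trace is accurate, the faulting statement at erl_goodfit_alloc.c:452 dereferences blk->prev, so it can fault whenever a free block's neighbour pointers are stale or corrupted, for instance if the same block is unlinked or freed twice without synchronisation. The sketch below is not ERTS code. The names free_blk, link_free_block and checked_unlink_free_block are hypothetical, and the circular list with a sentinel head is an assumption chosen to keep the example small. It shows a defensive unlink that checks the neighbours still point back at the block before writing through them, which turns a double unlink into a reported error instead of a wild write.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical free-block header; not the ERTS data structure. */
typedef struct free_blk {
    struct free_blk *prev;
    struct free_blk *next;
} free_blk;

/* Circular list with a sentinel, so linked blocks never have NULL neighbours. */
static void list_init(free_blk *head)
{
    head->prev = head;
    head->next = head;
}

/* Insert blk directly after the sentinel. */
static void link_free_block(free_blk *head, free_blk *blk)
{
    blk->next = head->next;
    blk->prev = head;
    head->next->prev = blk;
    head->next = blk;
}

/* Unlink blk only if both neighbours still point back at it.
 * Returns 0 on success, -1 if the list looks inconsistent. */
static int checked_unlink_free_block(free_blk *blk)
{
    if (blk->prev == NULL || blk->next == NULL)
        return -1;
    if (blk->prev->next != blk || blk->next->prev != blk)
        return -1;
    blk->prev->next = blk->next;   /* same shape as the faulting statement */
    blk->next->prev = blk->prev;
    return 0;
}

/* Usage example: the second unlink of the same block is rejected because
 * blk's stale prev pointer no longer leads back to blk.
 * Returns 1 if both calls behaved as described. */
static int demo_double_unlink(void)
{
    free_blk head, a, b;
    list_init(&head);
    link_free_block(&head, &a);
    link_free_block(&head, &b);
    int first = checked_unlink_free_block(&b);
    int second = checked_unlink_free_block(&b);
    return first == 0 && second == -1 && head.next == &a;
}
```

A check like this costs two extra comparisons per unlink. It would not fix a race, but in a debug build it could help show whether the free list was already inconsistent before the crash.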