[erlang-bugs] Segmentation fault in Erlang R12B

Erik Demming erik_demming@REDACTED
Wed Jun 4 17:57:48 CEST 2008


I encounter a segmentation fault running
Erlang (BEAM) emulator version 5.6.2 [source] [64-bit] [smp:2] [async-threads:0] [kernel-poll:false]
Linux 2.6.18-6-amd64 #1 SMP x86_64 GNU/Linux

My Erlang program handles web requests (GET) and stores them in an ets table. Periodically it switches tables: it creates a new ets table and sends the data from the finished table to a remote Erlang node.

Here is a shortened version of what I am doing:
    DataList = ets:tab2list(FinishedEts),
    Data = term_to_binary(DataList),
    gen_server:call(Pid, {data, Data}),
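
For anyone trying to reproduce this, the same steps can be written out as a slightly fuller sketch. The module name, the table name `requests`, the table options, and the helper names `rotate/1`, `snapshot/1` and `flush/2` are illustrative assumptions, not the code from the real program; only the `tab2list` / `term_to_binary` / `gen_server:call` sequence comes from the snippet above.

```erlang
-module(ets_rotate_sketch).
-export([rotate/1, snapshot/1, flush/2]).

%% Create a fresh table for incoming requests and return it together
%% with the finished one. Table name and options are assumptions.
rotate(FinishedEts) ->
    NewEts = ets:new(requests, [set, public]),
    {NewEts, FinishedEts}.

%% Serialise the complete contents of the finished table.
snapshot(FinishedEts) ->
    DataList = ets:tab2list(FinishedEts),
    term_to_binary(DataList).

%% Send the serialised block to the receiving gen_server. Pid may be
%% a pid on a remote node. gen_server:call/2 uses the default
%% 5000 ms timeout.
flush(FinishedEts, Pid) ->
    Data = snapshot(FinishedEts),
    Reply = gen_server:call(Pid, {data, Data}),
    %% Delete the table only after the call has returned.
    true = ets:delete(FinishedEts),
    Reply.
```

Keeping `snapshot/1` separate from `flush/2` makes it possible to exercise the serialisation step without a running receiver.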

The segfault is not reproducible on demand; it occurs within a few minutes after the web server starts. Load on the server is not an issue: only 50 req/s.
The segfault also occurs after switching off SMP and kernel-poll.

I also saved the last block (DataList), the one that caused the segfault, to a file. Manually re-sending that block did not lead to a segfault.
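
A minimal sketch of how such a block can be dumped and re-sent, assuming the block is written with `file:write_file/2` in external term format. The module name, the function names and the file path are hypothetical; the message shape `{data, Data}` is the one from the snippet above.

```erlang
-module(replay_sketch).
-export([save/2, load/1, replay/2]).

%% Write the block to disk in external term format.
save(Path, DataList) ->
    file:write_file(Path, term_to_binary(DataList)).

%% Read the block back and decode it into the original term.
load(Path) ->
    {ok, Bin} = file:read_file(Path),
    binary_to_term(Bin).

%% Re-send a saved block to the receiving gen_server, using the
%% same message shape as the live code path.
replay(Path, Pid) ->
    DataList = load(Path),
    gen_server:call(Pid, {data, term_to_binary(DataList)}).
```

A block replayed this way is rebuilt from the file by `binary_to_term/1`, so it is a new copy of the data rather than the term as it existed in the process at the time of the crash.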

The problem exists on all R12B-x versions.
After installing R11B-5 on all servers, no segfault occurred at all.

Here is what valgrind reported:

==6149== Memcheck, a memory error detector.
==6149== Copyright (C) 2002-2006, and GNU GPL'd, by Julian Seward et al.
==6149== Using LibVEX rev 1658, a library for dynamic binary translation.
==6149== Copyright (C) 2004-2006, and GNU GPL'd, by OpenWorks LLP.
==6149== Using valgrind-3.2.1-Debian, a dynamic binary instrumentation framework.
==6149== Copyright (C) 2000-2006, and GNU GPL'd, by Julian Seward et al.
==6149== For more details, rerun with: -v
==6149== 
==6149== My PID = 6149, parent PID = 1.  Prog and args are:
==6149==    /usr/local/lib/erlang/erts-5.6.2/bin/beam.smp
==6149==    --
==6149==    -root
==6149==    /usr/local/lib/erlang
==6149==    -progname
==6149==    erl
==6149==    --
==6149==    -home
==6149==    /root
==6149==    -name
==6149==    srv
==6149==    -setcookie
==6149==    xxxx
==6149==    -noshell
==6149==    -eval
==6149==    error_logger:logfile({open, '../log/log.txt'}).
==6149==    -eval
==6149==    reloader:start().
==6149==    -eval
==6149==    net_adm:ping('manager@REDACTED').
==6149==    -eval
==6149==    net_adm:ping('dist@REDACTED').
==6149==    -eval
==6149==    net_adm:ping('dist@REDACTED').
==6149==    -eval
==6149==    application:start(webnode).
==6149==    -eval
==6149==    net_adm:ping('dist@REDACTED').
==6149==    -eval
==6149==    net_adm:ping('dist@REDACTED').
==6149== 
==6149== Thread 6:
==6149== Invalid read of size 8
==6149==    at 0x45A8F2: copy_struct (copy.c:440)
==6149==    by 0x47783E: erts_send_message (erl_message.c:734)
==6149==    by 0x469E0A: do_send (bif.c:1842)
==6149==    by 0x46A3D5: send_2 (bif.c:1918)
==6149==    by 0x4D594A: process_main (beam_emu.c:1352)
==6149==    by 0x47C2A9: sched_thread_func (erl_process.c:741)
==6149==    by 0x51CBD3: thr_wrapper (ethread.c:474)
==6149==    by 0x5007F19: start_thread (in /lib/libpthread-2.3.6.so)
==6149==    by 0x52E95D1: clone (in /lib/libc-2.3.6.so)
==6149==  Address 0x579E058 is 0 bytes after a block of size 262,184 alloc'd
==6149==    at 0x4A1B858: malloc (vg_replace_malloc.c:149)
==6149==    by 0x4F06FB: erts_sys_alloc (sys.c:2415)
==6149==    by 0x42E89D: create_carrier (erl_alloc_util.c:484)
==6149==    by 0x42EF7F: erts_alcu_start (erl_alloc_util.c:2966)
==6149==    by 0x434E26: erts_gfalc_start (erl_goodfit_alloc.c:241)
==6149==    by 0x42216F: start_au_allocator (erl_alloc.c:771)
==6149==    by 0x42892F: erts_alloc_init (erl_alloc.c:575)
==6149==    by 0x43B95E: early_init (erl_init.c:640)
==6149==    by 0x43BAC8: erl_start (erl_init.c:669)
==6149==    by 0x420198: main (erl_main.c:28)
==6149== 
==6149== Invalid read of size 8
==6149==    at 0x45A8E0: copy_struct (copy.c:441)
==6149==    by 0x47783E: erts_send_message (erl_message.c:734)
==6149==    by 0x469E0A: do_send (bif.c:1842)
==6149==    by 0x46A3D5: send_2 (bif.c:1918)
==6149==    by 0x4D594A: process_main (beam_emu.c:1352)
==6149==    by 0x47C2A9: sched_thread_func (erl_process.c:741)
==6149==    by 0x51CBD3: thr_wrapper (ethread.c:474)
==6149==    by 0x5007F19: start_thread (in /lib/libpthread-2.3.6.so)
==6149==    by 0x52E95D1: clone (in /lib/libc-2.3.6.so)
==6149==  Address 0x579E060 is 8 bytes after a block of size 262,184 alloc'd
==6149==    at 0x4A1B858: malloc (vg_replace_malloc.c:149)
==6149==    by 0x4F06FB: erts_sys_alloc (sys.c:2415)
==6149==    by 0x42E89D: create_carrier (erl_alloc_util.c:484)
==6149==    by 0x42EF7F: erts_alcu_start (erl_alloc_util.c:2966)
==6149==    by 0x434E26: erts_gfalc_start (erl_goodfit_alloc.c:241)
==6149==    by 0x42216F: start_au_allocator (erl_alloc.c:771)
==6149==    by 0x42892F: erts_alloc_init (erl_alloc.c:575)
==6149==    by 0x43B95E: early_init (erl_init.c:640)
==6149==    by 0x43BAC8: erl_start (erl_init.c:669)
==6149==    by 0x420198: main (erl_main.c:28)
==6149== 
==6149== Process terminating with default action of signal 11 (SIGSEGV)
==6149==  Bad permissions for mapped region at address 0x585E000
==6149==    at 0x45A8E0: copy_struct (copy.c:441)
==6149==    by 0x47783E: erts_send_message (erl_message.c:734)
==6149==    by 0x469E0A: do_send (bif.c:1842)
==6149==    by 0x46A3D5: send_2 (bif.c:1918)
==6149==    by 0x4D594A: process_main (beam_emu.c:1352)
==6149==    by 0x47C2A9: sched_thread_func (erl_process.c:741)
==6149==    by 0x51CBD3: thr_wrapper (ethread.c:474)
==6149==    by 0x5007F19: start_thread (in /lib/libpthread-2.3.6.so)
==6149==    by 0x52E95D1: clone (in /lib/libc-2.3.6.so)
==6149== 
==6149== ERROR SUMMARY: 56217 errors from 2 contexts (suppressed: 8 from 1)
==6149== malloc/free: in use at exit: 16,315,457 bytes in 38 blocks.
==6149== malloc/free: 46,027 allocs, 45,989 frees, 21,041,414 bytes allocated.
==6149== For counts of detected errors, rerun with: -v
==6149== searching for pointers to 38 not-freed blocks.
==6149== checked 226,037,216 bytes.
==6149== 
==6149== LEAK SUMMARY:
==6149==    definitely lost: 0 bytes in 0 blocks.
==6149==      possibly lost: 680 bytes in 5 blocks.
==6149==    still reachable: 16,314,777 bytes in 33 blocks.
==6149==         suppressed: 0 bytes in 0 blocks.
==6149== Reachable blocks (those to which a pointer was found) are not shown.
==6149== To see them, rerun with: --show-reachable=yes

It would be great if anybody has a clue about this.

Thanks in advance,
Erik


