Erlang seg faults while running ejabberd

tty
Mon Mar 14 22:44:19 CET 2005


Hi,

Apologies if this was sent twice.

Erlang seg faults while running ejabberd. My environment is:

SuSE v9.0
Linux 2.4.21-243-default
gcc version 3.3.1 (SuSE Linux)
Erlang (HIPE) (BEAM) emulator version 5.4.4
ejabberd 0.7.5

The backtrace is:


#0  0x4010fde8 in strcmp () from /lib/i686/libc.so.6
#1  0x40a28064 in obj_name_cmp () from /usr/lib/libcrypto.so.0.9.7
#2  0x40a7fc74 in ?? ()
#3  0x40638c74 in empty.0 () from /usr/lib/libssl.so.0.9.7
#4  0x000000ba in ?? ()
#5  0x080957b6 in driver_alloc (size=139071024) at erl_alloc.h:187
#6  0x03149d58 in ?? ()
#7  0x084a0a20 in ?? ()
#8  0x00008000 in ?? ()
#9  0x084a0e08 in ?? ()
#10 0x40a26e25 in lh_insert () from /usr/lib/libcrypto.so.0.9.7
#11 0x084a0a20 in ?? ()
#12 0x084a0e08 in ?? ()
#13 0xbfffec58 in ?? ()
#14 0x409b290e in OBJ_NAME_add () from /usr/lib/libcrypto.so.0.9.7
#15 0x40a52ce4 in __JCR_LIST__ () from /usr/lib/libcrypto.so.0.9.7
#16 0x00000001 in ?? ()
#17 0x00008000 in ?? ()
#18 0xbfffecb8 in ?? ()
#19 0x409b28c8 in OBJ_NAME_add () from /usr/lib/libcrypto.so.0.9.7
#20 0x080f0ee0 in erts_sys_alloc (t=139070968, x=0x84a0e20, sz=1083910344) at sys/unix/sys.c:3035
Previous frame inner to this frame (corrupt stack?)


Listing the source at frame #20 (sys/unix/sys.c:3035) shows:

3035        void *res = malloc((size_t) sz);
3036    #if HAVE_ERTS_MSEG
3037        if (!res) {
3038            erts_mseg_clear_cache();
3039            return malloc((size_t) sz);
3040        }
3041    #endif
3042        return res;
3043    }


Reproducing this takes around 2-3 tries on average.

Steps:

1. Set up a distributed ejabberd cluster (Node A and Node B). Both nodes hold the same mnesia tables,
each with the same storage type on both nodes (e.g. route is ram_copies on both, offline_msg is disc_only_copies on both, etc.).

2. Log in on both nodes with one client each (tkabber was used in this case): Ca is the client on Node A, Cb is the client on Node B.

3. Ca chats with Cb.

4. Shut down Node A.

5. Bring Node A back up.

6. Cb logs off.

7. Cb logs in.

At this point Node B should seg fault. As noted earlier, it may take several attempts to trigger the seg fault.
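
For step 1, here is a minimal sketch of how the table layout could be checked from the Erlang shell on each node. It assumes mnesia is already running there and that both tables have a local replica. The table names route and offline_msg are the ones mentioned in step 1; mnesia:table_info/2 with the storage_type item reports how a table is stored on the local node.

```erlang
%% Run in the Erlang shell on Node A, then again on Node B.
%% Assumes mnesia is started and both tables exist on the node.
Tables = [route, offline_msg].

%% Returns a list of {Table, StorageType} pairs for the local node,
%% where StorageType is ram_copies, disc_copies, disc_only_copies,
%% or unknown if the node holds no replica of the table.
[{T, mnesia:table_info(T, storage_type)} || T <- Tables].
```

For the setup described above, both nodes should report the same storage type for each table.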

Thanks in advance

totem






More information about the erlang-bugs mailing list