[erlang-questions] erlang node core dumps in erl_bestfit_alloc
Lukas Larsson
garazdawi@REDACTED
Mon Jun 30 18:02:50 CEST 2014
Hello,
99% of the time when you see a memory corruption error in the Erlang
allocators, it is because a NIF or linked-in driver is mismanaging memory it
got from driver_alloc or enif_alloc. To track down the error I would build
either a debug or a valgrind-enabled emulator and see if that shows anything.
For a description of how to build a debug emulator, see the INSTALL howto [1].
If you want to build a valgrind emulator, just substitute debug with valgrind
in the guide and it should work.
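As a rough sketch of what the howto describes, the steps look something like
the following. It assumes ERL_TOP points at an OTP source tree that has
already been configured and built the normal way, and that the smp flavor is
wanted (the backtrace shows beam.smp). The helper names run, build_emulator
and start_emulator, and the DRY_RUN switch, are made up for this example; the
exact make targets may differ between OTP releases, so check the howto for
your version.

```shell
#!/bin/sh
# Sketch only: helper names and DRY_RUN are illustrative, not part of OTP.
# Assumes ERL_TOP is set to a configured and fully built OTP source tree.

run() {
    # With DRY_RUN=1 only print the command; otherwise execute it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

build_emulator() {
    # $1: emulator type, "debug" or "valgrind"
    # $2: flavor, defaults to "smp"
    emu_type="$1"
    flavor="${2:-smp}"
    case "$emu_type" in
        debug|valgrind) ;;
        *) echo "unknown emulator type: $emu_type" >&2; return 1 ;;
    esac
    run cd "$ERL_TOP/erts/emulator" && run make "$emu_type" "FLAVOR=$flavor"
}

start_emulator() {
    # Start the specially built emulator through the cerl wrapper script.
    run "$ERL_TOP/bin/cerl" "-$1"
}
```

Calling build_emulator debug followed by start_emulator debug would then
build and start the debug emulator; replace debug with valgrind for the
valgrind build. Setting DRY_RUN=1 first prints the commands without running
them.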
Lukas
[1]: https://github.com/erlang/otp/blob/master/HOWTO/INSTALL.md#how-to-build-a-debug-enabled-erlang-runtime-system
On Mon, Jun 30, 2014 at 12:08 PM, Puneet Ahuja <puneet@REDACTED> wrote:
> Hi,
>
>
> We are frequently (every two days or so) encountering an issue where an
> erlang node crashes with the following backtrace, the same crash is
> replicated on more than one machine (with the same backtrace):
>
> Core was generated by `/usr/local/lib/erlang/erts-5.10.4/bin/beam.smp
> -zdbbl 20000 -swt very_low -sbt'.
> Program terminated with signal 11, Segmentation fault.
> #0 replace (allctr=0x130d400, size=296, cand_blk=<value optimized out>,
> cand_size=<value optimized out>) at beam/erl_bestfit_alloc.c:287
> 287 else if (x == x->parent->left)
> #0 replace (allctr=0x130d400, size=296, cand_blk=<value optimized out>,
> cand_size=<value optimized out>) at beam/erl_bestfit_alloc.c:287
> #1 bf_unlink_free_block (allctr=0x130d400, size=296, cand_blk=<value
> optimized out>, cand_size=<value optimized out>) at
> beam/erl_bestfit_alloc.c:797
> #2 bf_get_free_block (allctr=0x130d400, size=296, cand_blk=<value
> optimized out>, cand_size=<value optimized out>) at
> beam/erl_bestfit_alloc.c:857
> #3 0x0000000000443106 in mbc_alloc_block (allctr=0x130d400, size=287) at
> beam/erl_alloc_util.c:1956
> #4 mbc_alloc (allctr=0x130d400, size=287) at beam/erl_alloc_util.c:2085
> #5 0x0000000000448963 in erts_alcu_alloc_thr_pref (type=170, extra=<value
> optimized out>, size=287) at beam/erl_alloc_util.c:4932
> #6 0x00000000005099a8 in erts_alloc (type=<value optimized out>,
> size=287) at beam/erl_alloc.h:223
> #7 0x0000000000509b7e in erts_bin_nrml_alloc (c_p=0x7ff5e3d85e48,
> reg=<value optimized out>, live=<value optimized out>,
> build_size_term=<value optimized out>, extra_words=<value optimized out>,
> unit=<value optimized out>) at beam/erl_binary.h:260
> #8 erts_bs_append (c_p=0x7ff5e3d85e48, reg=<value optimized out>,
> live=<value optimized out>, build_size_term=<value optimized out>,
> extra_words=<value optimized out>, unit=<value optimized out>) at
> beam/erl_bits.c:1373
> #9 0x000000000053d510 in process_main () at beam/beam_emu.c:3843
> #10 0x000000000049dc8b in sched_thread_func (vesdp=0x7ff5e30168c0) at
> beam/erl_process.c:5801
> #11 0x00000000005bda96 in thr_wrapper (vtwd=0x7fffc11bd510) at
> pthread/ethread.c:106
> #12 0x0000003883c079d1 in start_thread () from /lib64/libpthread.so.0
> #13 0x00000038838e8b6d in clone () from /lib64/libc.so.6
>
>
> The erlang otp version is R16B03-1 built from the source code without any
> customisation with respect to the configure script parameters. The exmpp
> library we depend on uses port drivers. We don’t encounter this problem in
> the nodes running on R15.
> The source code at the crash point seems to point to some kind of memory
> corruption in the erlang acquired memory in erl_bestfit_alloc. Any pointers
> on how to resolve this problem would help.
>
> Linux Kernel Version: 2.6.32-431.3.1.el6.x86_64
> Linux Flavor: Centos 6.5
>
> The crash occurs during minimal load conditions.
>
> Additionally gdb shows following libraries from the core dump:
>
> *
> * Libraries
> *
> From To Syms Read Shared Object Library
> 0x0000003885400e10 0x0000003885401688 Yes (*) /lib64/libutil.so.1
> 0x0000003dd0c00de0 0x0000003dd0c01998 Yes (*) /lib64/libdl.so.2
> 0x0000003884403e70 0x0000003884443fb8 Yes (*) /lib64/libm.so.6
> 0x00000039cac06a30 0x00000039cac1cf88 Yes (*) /lib64/libncurses.so.5
> 0x0000003883c05760 0x0000003883c110c8 Yes (*) /lib64/libpthread.so.0
> 0x0000003884002140 0x00000038840054f8 Yes (*) /lib64/librt.so.1
> 0x000000388381ea60 0x000000388394024c Yes (*) /lib64/libc.so.6
> 0x00000039cc00c840 0x00000039cc015c08 Yes (*) /lib64/libtinfo.so.5
> 0x0000003883000b00 0x00000038830198eb Yes (*)
> /lib64/ld-linux-x86-64.so.2
> 0x00007ff5d9a873c0 0x00007ff5d9a8e2c8 Yes
> /usr/local/lib/erlang/lib/crypto-3.2/priv/lib/crypto.so
> 0x00007ff5d970abc0 0x00007ff5d97fd9a8 Yes (*)
> /usr/lib64/libcrypto.so.10
> 0x0000003884802120 0x000000388480d3a8 Yes (*) /lib64/libz.so.1
> 0x00007ff5d94a1900 0x00007ff5d94a1bf8 Yes
> /usr/local/lib/erlang/lib/crypto-3.2/priv/lib/crypto_callback.so
> 0x00007ff5d9272f00 0x00007ff5d9276708 Yes
> /usr/local/lib/erlang/lib/exmpp-0.9.9-10-g3d17ff4/priv/lib/exmpp_stringprep.so
> 0x00007ff5d8c38fa0 0x00007ff5d8c3dbf8 Yes
> /usr/local/lib/erlang/lib/exmpp-0.9.9-10-g3d17ff4/priv/lib/exmpp_xml_expat.so
> 0x0000003886403cd0 0x000000388641cc88 Yes (*) /lib64/libexpat.so.1
> 0x00007ff5d8a2ef00 0x00007ff5d8a33948 Yes
> /usr/local/lib/erlang/lib/exmpp-0.9.9-10-g3d17ff4/priv/lib/exmpp_xml_expat_legacy.so
> 0x00007ff5d8824f50 0x00007ff5d8829a28 Yes
> /usr/local/lib/erlang/lib/exmpp-0.9.9-10-g3d17ff4/priv/lib/exmpp_xml_libxml2.so
> 0x0000003cc562c7c0 0x0000003cc5709498 Yes (*) /usr/lib64/libxml2.so.2
> 0x00007ff5d3979950 0x00007ff5d397db18 Yes
> /usr/local/lib/erlang/lib/exmpp-0.9.9-10-g3d17ff4/priv/lib/exmpp_tls_openssl.so
> 0x00007ff5d34d0180 0x00007ff5d350a938 Yes (*) /usr/lib64/libssl.so.10
> 0x0000003cc3a0ac30 0x0000003cc3a38728 Yes (*)
> /lib64/libgssapi_krb5.so.2
> 0x0000003cc4e1b430 0x0000003cc4e94a78 Yes (*) /lib64/libkrb5.so.3
> 0x00007ff5d32b53f0 0x00007ff5d32b5fc8 Yes (*) /lib64/libcom_err.so.2
> 0x0000003cc36043d0 0x0000003cc361d5a8 Yes (*) /lib64/libk5crypto.so.3
> 0x0000003cc4a02a40 0x0000003cc4a080c8 Yes (*)
> /lib64/libkrb5support.so.0
> 0x0000003cc3e00bf0 0x0000003cc3e011d8 Yes (*) /lib64/libkeyutils.so.1
> 0x0000003886003930 0x0000003886012938 Yes (*) /lib64/libresolv.so.2
> 0x0000003cc1a05850 0x0000003cc1a15cc8 Yes (*) /lib64/libselinux.so.1
> 0x00007ff5d30af5d0 0x00007ff5d30b2a28 Yes
> /usr/local/lib/erlang/lib/exmpp-0.9.9-10-g3d17ff4/priv/lib/exmpp_compress_zlib.so
>
> Regards,
> Puneet