[erlang-questions] erlang node core dumps in erl_bestfit_alloc

Lukas Larsson garazdawi@REDACTED
Mon Jun 30 18:02:50 CEST 2014


Hello,

99% of the time when you see a memory corruption error in the Erlang
allocators, it is because a NIF or linked-in driver is mismanaging memory it
got from driver_alloc or enif_alloc. To track down the error, I would build
either a debug or a valgrind-enabled emulator and see if that shows anything.
For a description of how to build a debug emulator, see the INSTALL[1] howto.
If you want to build a valgrind emulator, just substitute debug with valgrind
in the guide and it should work.
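
As a rough sketch of the build step, assuming ERL_TOP points at an OTP
source tree that has already been configured and built normally, and that
the tree offers the debug/valgrind make targets described in the INSTALL
howto (the function name build_emulator is only illustrative, and FLAVOR=smp
is chosen to match the beam.smp binary in the backtrace):

```shell
#!/bin/sh
# Sketch only: build a debug or valgrind variant of the emulator.
# Assumptions: ERL_TOP is an OTP source tree that was already configured and
# built the normal way; make targets and flags may differ between releases,
# so check HOWTO/INSTALL.md for the release you are building.

build_emulator() {
    kind=${1:-}    # "debug" or "valgrind"
    case "$kind" in
        debug|valgrind) ;;
        *) echo "usage: build_emulator debug|valgrind" >&2; return 2 ;;
    esac
    if [ -z "${ERL_TOP:-}" ] || [ ! -d "$ERL_TOP/erts/emulator" ]; then
        echo "ERL_TOP must point at an OTP source tree" >&2
        return 1
    fi
    # Build the requested emulator type in a subshell so the caller's
    # working directory is left untouched.
    (cd "$ERL_TOP/erts/emulator" && make "$kind" FLAVOR=smp)
}
```

After `build_emulator debug`, the howto starts the resulting emulator with
`$ERL_TOP/bin/cerl -debug`; for a valgrind build the corresponding switch
should be `-valgrind`, but verify that against the cerl script in your tree.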

Lukas

   [1]: https://github.com/erlang/otp/blob/master/HOWTO/INSTALL.md#how-to-build-a-debug-enabled-erlang-runtime-system


On Mon, Jun 30, 2014 at 12:08 PM, Puneet Ahuja <puneet@REDACTED> wrote:

> Hi,
>
>
> We are frequently (every two days or so) encountering an issue where an
> erlang node crashes with the following backtrace, the same crash is
> replicated on more than one machine (with the same backtrace):
>
> Core was generated by `/usr/local/lib/erlang/erts-5.10.4/bin/beam.smp
> -zdbbl 20000 -swt very_low -sbt'.
> Program terminated with signal 11, Segmentation fault.
> #0  replace (allctr=0x130d400, size=296, cand_blk=<value optimized out>,
> cand_size=<value optimized out>) at beam/erl_bestfit_alloc.c:287
> 287         else if (x == x->parent->left)
> #0  replace (allctr=0x130d400, size=296, cand_blk=<value optimized out>,
> cand_size=<value optimized out>) at beam/erl_bestfit_alloc.c:287
> #1  bf_unlink_free_block (allctr=0x130d400, size=296, cand_blk=<value
> optimized out>, cand_size=<value optimized out>) at
> beam/erl_bestfit_alloc.c:797
> #2  bf_get_free_block (allctr=0x130d400, size=296, cand_blk=<value
> optimized out>, cand_size=<value optimized out>) at
> beam/erl_bestfit_alloc.c:857
> #3  0x0000000000443106 in mbc_alloc_block (allctr=0x130d400, size=287) at
> beam/erl_alloc_util.c:1956
> #4  mbc_alloc (allctr=0x130d400, size=287) at beam/erl_alloc_util.c:2085
> #5  0x0000000000448963 in erts_alcu_alloc_thr_pref (type=170, extra=<value
> optimized out>, size=287) at beam/erl_alloc_util.c:4932
> #6  0x00000000005099a8 in erts_alloc (type=<value optimized out>,
> size=287) at beam/erl_alloc.h:223
> #7  0x0000000000509b7e in erts_bin_nrml_alloc (c_p=0x7ff5e3d85e48,
> reg=<value optimized out>, live=<value optimized out>,
> build_size_term=<value optimized out>, extra_words=<value optimized out>,
> unit=<value optimized out>) at beam/erl_binary.h:260
> #8  erts_bs_append (c_p=0x7ff5e3d85e48, reg=<value optimized out>,
> live=<value optimized out>, build_size_term=<value optimized out>,
> extra_words=<value optimized out>, unit=<value optimized out>) at
> beam/erl_bits.c:1373
> #9  0x000000000053d510 in process_main () at beam/beam_emu.c:3843
> #10 0x000000000049dc8b in sched_thread_func (vesdp=0x7ff5e30168c0) at
> beam/erl_process.c:5801
> #11 0x00000000005bda96 in thr_wrapper (vtwd=0x7fffc11bd510) at
> pthread/ethread.c:106
> #12 0x0000003883c079d1 in start_thread () from /lib64/libpthread.so.0
> #13 0x00000038838e8b6d in clone () from /lib64/libc.so.6
>
>
> The erlang otp version is R16B03-1 built from the source code without any
> customisation with respect to the configure script parameters. The exmpp
> library we depend on uses port drivers.  We don’t encounter this problem in
> the nodes running on R15.
> The source code at the crash point seems to point to some kind of memory
> corruption in the erlang acquired memory in erl_bestfit_alloc. Any pointers
> on how to resolve this problem would help.
>
> Linux Kernel Version: 2.6.32-431.3.1.el6.x86_64
> Linux Flavor: Centos 6.5
>
> The crash occurs during minimal load conditions.
>
> Additionally gdb shows following libraries from the core dump:
>
> *
> * Libraries
> *
> From                To                  Syms Read   Shared Object Library
> 0x0000003885400e10  0x0000003885401688  Yes (*)     /lib64/libutil.so.1
> 0x0000003dd0c00de0  0x0000003dd0c01998  Yes (*)     /lib64/libdl.so.2
> 0x0000003884403e70  0x0000003884443fb8  Yes (*)     /lib64/libm.so.6
> 0x00000039cac06a30  0x00000039cac1cf88  Yes (*)     /lib64/libncurses.so.5
> 0x0000003883c05760  0x0000003883c110c8  Yes (*)     /lib64/libpthread.so.0
> 0x0000003884002140  0x00000038840054f8  Yes (*)     /lib64/librt.so.1
> 0x000000388381ea60  0x000000388394024c  Yes (*)     /lib64/libc.so.6
> 0x00000039cc00c840  0x00000039cc015c08  Yes (*)     /lib64/libtinfo.so.5
> 0x0000003883000b00  0x00000038830198eb  Yes (*)
> /lib64/ld-linux-x86-64.so.2
> 0x00007ff5d9a873c0  0x00007ff5d9a8e2c8  Yes
> /usr/local/lib/erlang/lib/crypto-3.2/priv/lib/crypto.so
> 0x00007ff5d970abc0  0x00007ff5d97fd9a8  Yes (*)
> /usr/lib64/libcrypto.so.10
> 0x0000003884802120  0x000000388480d3a8  Yes (*)     /lib64/libz.so.1
> 0x00007ff5d94a1900  0x00007ff5d94a1bf8  Yes
> /usr/local/lib/erlang/lib/crypto-3.2/priv/lib/crypto_callback.so
> 0x00007ff5d9272f00  0x00007ff5d9276708  Yes
> /usr/local/lib/erlang/lib/exmpp-0.9.9-10-g3d17ff4/priv/lib/exmpp_stringprep.so
> 0x00007ff5d8c38fa0  0x00007ff5d8c3dbf8  Yes
> /usr/local/lib/erlang/lib/exmpp-0.9.9-10-g3d17ff4/priv/lib/exmpp_xml_expat.so
> 0x0000003886403cd0  0x000000388641cc88  Yes (*)     /lib64/libexpat.so.1
> 0x00007ff5d8a2ef00  0x00007ff5d8a33948  Yes
> /usr/local/lib/erlang/lib/exmpp-0.9.9-10-g3d17ff4/priv/lib/exmpp_xml_expat_legacy.so
> 0x00007ff5d8824f50  0x00007ff5d8829a28  Yes
> /usr/local/lib/erlang/lib/exmpp-0.9.9-10-g3d17ff4/priv/lib/exmpp_xml_libxml2.so
> 0x0000003cc562c7c0  0x0000003cc5709498  Yes (*)     /usr/lib64/libxml2.so.2
> 0x00007ff5d3979950  0x00007ff5d397db18  Yes
> /usr/local/lib/erlang/lib/exmpp-0.9.9-10-g3d17ff4/priv/lib/exmpp_tls_openssl.so
> 0x00007ff5d34d0180  0x00007ff5d350a938  Yes (*)     /usr/lib64/libssl.so.10
> 0x0000003cc3a0ac30  0x0000003cc3a38728  Yes (*)
> /lib64/libgssapi_krb5.so.2
> 0x0000003cc4e1b430  0x0000003cc4e94a78  Yes (*)     /lib64/libkrb5.so.3
> 0x00007ff5d32b53f0  0x00007ff5d32b5fc8  Yes (*)     /lib64/libcom_err.so.2
> 0x0000003cc36043d0  0x0000003cc361d5a8  Yes (*)     /lib64/libk5crypto.so.3
> 0x0000003cc4a02a40  0x0000003cc4a080c8  Yes (*)
> /lib64/libkrb5support.so.0
> 0x0000003cc3e00bf0  0x0000003cc3e011d8  Yes (*)     /lib64/libkeyutils.so.1
> 0x0000003886003930  0x0000003886012938  Yes (*)     /lib64/libresolv.so.2
> 0x0000003cc1a05850  0x0000003cc1a15cc8  Yes (*)     /lib64/libselinux.so.1
> 0x00007ff5d30af5d0  0x00007ff5d30b2a28  Yes
> /usr/local/lib/erlang/lib/exmpp-0.9.9-10-g3d17ff4/priv/lib/exmpp_compress_zlib.so
>
> Regards,
> Puneet
>