[erlang-questions] Turning on ASSERT's in OTP code.

Paul Davis paul.joseph.davis@REDACTED
Tue Jun 29 17:21:50 CEST 2010


On Mon, Jun 28, 2010 at 5:22 AM, Sverker Eriksson
<sverker@REDACTED> wrote:
> A "debug" build will enable ASSERT's.
>
> From INSTALL.md:
>
> "How to Build a Debug Enabled Erlang RunTime System
> --------------------------------------------------
>
> After completing all the normal building steps described above a debug
> enabled runtime system can be built. To do this you have to change
> directory to `$ERL_TOP/erts/emulator`.
>
> In this directory execute:
>
>   $ make debug FLAVOR=$FLAVOR
>
> where `$FLAVOR` is either `plain` or `smp`. The flavor options will
> produce a beam.debug and beam.smp.debug executable respectively. The
> files are installed alongside the normal (opt) versions `beam.smp`
> and `beam`.
>
> To start the debug enabled runtime system execute:
>
>   $ $ERL_TOP/bin/cerl -debug
>
> The debug enabled runtime system features lock violation checking,
> assert checking and various sanity checks to help a developer ensure
> correctness. Some of these features can be enabled on a normal beam
> using appropriate configure options.
>
> There are other types of runtime systems that can be built as well,
> using steps similar to those just described.
>
>   $ make $TYPE FLAVOR=$FLAVOR
>
> where `$TYPE` is `opt`, `gcov`, `gprof`, `debug`, `valgrind`, or `lcnt`.
> These different beam types are useful for debugging and profiling
> purposes."
>
> /Sverker, Erlang/OTP
>
>
>
> Paul Joseph Davis wrote:
>>
>> Yeah, this is affecting a specific point in some tests I'm migrating. I
>> haven't been able to narrow down what it is about this specific message
>> that's causing an error but other messages are sent fine. My hope is that by
>> turning on the asserts I can narrow down where exactly I'm going off the
>> reservation.
>> Paul Davis
>>
>> On Jun 28, 2010, at 3:41 AM, Rapsey <rapsey@REDACTED> wrote:
>>
>>
>>>
>>> I'm using a NIF library that sends messages from a thread and have not
>>> seen such crashes. The environment gets invalidated after a send. Are you
>>> calling enif_clear_env or enif_free_env after enif_send?
>>>
>>> Sergej
>>>
>>> On Mon, Jun 28, 2010 at 8:12 AM, Paul Davis <paul.joseph.davis@REDACTED>
>>> wrote:
>>> I've stumbled across a weird segfault when sending terms from a thread
>>> spawned by a NIF. Investigating this, I followed a traceback to
>>> erts/emulator/beam/erl_alloc_util.c. Glancing at the code in
>>> erl_alloc_util.c I notice that the line before the crash is an assert.
>>> Checking the value in gdb shows that the assertion is violated. My
>>> first thought was to try and enable these asserts but I'm not sure
>>> where to look for docs on how to do such a thing. Looking through the
>>> code I see them looking for DEBUG to be defined yet nothing in
>>> ./configure's help suggests a flag to set.
>>>
>>> In the interest of trying things I tried CFLAGS="-DDEBUG" ./configure
>>> && CFLAGS="-DDEBUG" make but it dies with an error about pcre.
>>>
>>> Anyone have a pointer on where to look for getting these enabled?
>>>
>>> Here's a copy of the traceback if it helps anyone. The lines in
>>> erl_alloc_util.c starting at 598:
>>>
>>>   ASSERT(crr->prev);
>>>   crr->prev->next = crr->next;
>>>
>>> gdb shows that crr is valid and crr->prev is NULL.
>>>
>>> #0  0x08085e82 in unlink_carrier (allctr=0xa2dc140, blk=<value optimized out>)
>>>   at beam/erl_alloc_util.c:599
>>> #1  destroy_carrier (allctr=0xa2dc140, blk=<value optimized out>)
>>>   at beam/erl_alloc_util.c:1615
>>> #2  0x08087161 in do_erts_alcu_free (type=148, unused=0x8250b60, p=0xa2dd7f0)
>>>   at beam/erl_alloc_util.c:2648
>>> #3  erts_alcu_free_thr_pref (type=148, unused=0x8250b60, p=0xa2dd7f0)
>>>   at beam/erl_alloc_util.c:2699
>>> #4  0x081535da in erts_free (env=0xa2dd7f0) at beam/erl_alloc.h:210
>>> #5  enif_free_env (env=0xa2dd7f0) at beam/erl_nif.c:256
>>> #6  0x00e21218 in job_destroy (obj=0xa2dd7c0) at c_src/job.c:32
>>> #7  0x00e21cda in queue_done (queue=0xa2dd718, item=0xa2dd7c0)
>>>   at c_src/queue.c:184
>>> #8  0x00e23849 in worker_exec (vm=0xa2a7d10) at c_src/worker.c:124
>>> #9  0x00e2376d in worker_run (arg=0xa2dd628) at c_src/worker.c:85
>>> #10 0x081d2a2c in thr_wrapper (vtwd=0xb64e5020) at common/ethread.c:480
>>> #11 0x0068296e in start_thread () from /lib/tls/i686/cmov/libpthread.so.0
>>> #12 0x00260a4e in clone () from /lib/tls/i686/cmov/libc.so.6
>>>
>>>
>>> Thanks,
>>> Paul Davis
>>>
>>>
>>>
>>>
>>
>>
>>
>
>

Thanks for everyone's input. I managed to get the debug version built
last night and tested it with my code. Unfortunately, I've managed to
completely remove any trace of the bug. I've also run it through
valgrind and DUMA to try to track down any memory issues, to no
avail.
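
For anyone finding this thread in the archive, here is a rough sketch of
why the quoted ASSERT stays silent in a normal (opt) build and the crash
only shows up on the following line. It is not the code from
erl_alloc_util.c: the `carrier` struct, the `unlink_nonhead` function and
this particular ASSERT definition are made up for illustration, assuming
only that the real macro is compiled in when DEBUG is defined.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the emulator's ASSERT macro: it only
 * checks anything when DEBUG is defined; otherwise it expands to
 * a no-op and the expression is never evaluated. */
#ifdef DEBUG
#  define ASSERT(e) assert(e)
#else
#  define ASSERT(e) ((void) 0)
#endif

/* Hypothetical doubly linked node standing in for a carrier. */
typedef struct carrier {
    struct carrier *prev;
    struct carrier *next;
} carrier;

/* Unlink a node that is expected to have a predecessor. */
static void unlink_nonhead(carrier *crr)
{
    ASSERT(crr->prev);            /* checked only in a DEBUG build */
    crr->prev->next = crr->next;  /* NULL dereference here if prev is NULL */
    if (crr->next != NULL)
        crr->next->prev = crr->prev;
    crr->prev = NULL;
    crr->next = NULL;
}
```

Built with -DDEBUG, calling unlink_nonhead on a node whose prev is NULL
should stop at the ASSERT with a file and line number. Without it, the
same call falls through and dereferences NULL on the next line, which is
the pattern in the traceback (ASSERT at line 598, crash at line 599).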

There was an issue with building the valgrind version of the emulator
though. Linking failed with an error about a missing definition for
something like "erts_fpu_interrupt". I can't remember the exact third
part of the function name. If I figure that out tonight I'll send a
report.

Thanks,
Paul Davis

