<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    Thanks Lukas and Peti, that's great. "erl -emu_type debug"
    is definitely picked up - I haven't made the debug build yet, so I get
    "erlexec: The emulator
    '/usr/local/Cellar/erlang/20.3.4/lib/erlang/erts-9.3/bin/beam.debug.smp'
    does not exist", which is what I'd expect at this point. I'll get onto
    the debug build and see what I can find out.<br>
    <br>
    In case anyone else wants to use that in rebar3 shell, I found
    <a class="moz-txt-link-freetext" href="http://www.rebar3.org/v3.0/discuss/5745fb105528582000dfb47f">http://www.rebar3.org/v3.0/discuss/5745fb105528582000dfb47f</a>, which
    shows that you can either set -emu_type directly in ERL_FLAGS, or
    point ERL_FLAGS at a vm.args file and set the flag in there, e.g.:<br>
    <br>
    <blockquote type="cite">ERL_FLAGS=" -args_file config/vm.args
      -config config/sys.config" rebar3 shell</blockquote>
    <br>
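    As a rough sketch of the vm.args route, assuming the debug emulator
    has been built and installed and that -emu_type is accepted in an args
    file as Peti suggests, config/vm.args would only need a line like
    this:<br>
    <pre>## config/vm.args (sketch; path taken from the command above)
## Select the debug emulator; requires beam.debug.smp to exist.
-emu_type debug</pre>
    The more direct variant would be to put "-emu_type debug" in ERL_FLAGS
    itself in place of the -args_file option.<br>
    <br>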
    Cheers,<br>
    Igor<br>
    <br>
    <div class="moz-cite-prefix">On 29/05/2018 14:09, Peti Gömöri wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:CAEhaAyGNh6aT7VwY0bnG32W2xoLTfTk-Vm9zXnGXT-R0H4AidQ@mail.gmail.com">
      <div dir="ltr">Since OTP 20, the <span class="gmail-code" style=""><b>-emu_type</b> flag
          might also work, e.g.:</span>
        <div><span class="gmail-code" style="">  erl -emu_type debug</span></div>
        <div><span class="gmail-code" style=""><br>
          </span></div>
        <div><span class="gmail-code" style="">and you can put it in the
            vm.args file too</span></div>
      </div>
      <div class="gmail_extra"><br>
        <div class="gmail_quote">On Tue, May 29, 2018 at 2:45 PM, Lukas
          Larsson <span dir="ltr">&lt;<a href="mailto:lukas@erlang.org"
              target="_blank" moz-do-not-send="true">lukas@erlang.org</a>&gt;</span>
          wrote:<br>
          <blockquote class="gmail_quote" style="margin:0 0 0
            .8ex;border-left:1px #ccc solid;padding-left:1ex">
            <div dir="ltr">I don't know how to make rebar3 run the debug
              emulator, but a quick and dirty trick that I do when all
              else fails is to copy the beam.debug.smp file over the
              beam.smp file.
              <div><br>
              </div>
              <div>You probably also have to copy the
                erl_child_setup.debug file; that file, however, should
                keep its .debug suffix. So:</div>
              <div><br>
              </div>
              <div>cp bin/`erts/autoconf/config.guess`/beam.debug.smp
                path/to/release/erts-v.s.n/bin/beam.smp</div>
              <div>cp bin/`erts/autoconf/config.guess`/erl_child_setup.debug
                path/to/release/erts-v.s.n/bin/
                <div>
                  <div class="h5"><br>
                    <div class="gmail_extra"><br>
                      <div class="gmail_quote">On Tue, May 29, 2018 at
                        1:30 PM, Igor Clark <span dir="ltr">&lt;<a
                            href="mailto:igor.clark@gmail.com"
                            target="_blank" moz-do-not-send="true">igor.clark@gmail.com</a>&gt;</span>
                        wrote:<br>
                        <blockquote class="gmail_quote"
                          style="margin:0px 0px 0px
                          0.8ex;border-left:1px solid
                          rgb(204,204,204);padding-left:1ex">
                          <div bgcolor="#FFFFFF"> Thanks very much
                            Lukas, I think the debug emulator could be
                            what I'm looking for. The NIF only sometimes
                            crashes on lists:member/2 - those log lines
                            are all from different crashes (there's only
                            one crashed thread each time), and sometimes
                            it just crashes on process_main. So I think
                            I might need the debug emulator to trace
                            further.<br>
                            <br>
                            However I have a lot to learn about how to
                            integrate C tooling with something so
                            complex. When I run the debug emulator, does
                            it just show more detailed info in stack
                            traces, or will I need to attach gdb/lldb
                            etc to find out what's going on? Is there
                            any more info on how to set this all up?<br>
                            <br>
                            Also, not 100% sure how to run it, as I run
                            my app with "rebar3 shell" from a release
                            layout during development, or the same
                            inside the NIF-specific app when trying to
                            track problems down there. The doc you
                            linked says:<br>
                            <br>
                            <blockquote type="cite">
                              <p>To start the debug enabled runtime
                                system execute:</p>
                              <pre style="box-sizing:border-box;font-family:SFMono-Regular,Consolas,&quot;Liberation Mono&quot;,Menlo,Courier,monospace;font-size:13.6px;margin-top:0px;margin-bottom:16px;word-wrap:normal;padding:16px;overflow:auto;line-height:1.45;background-color:rgb(246,248,250);border-radius:3px;color:rgb(36,41,46);font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;word-spacing:0px;text-decoration:none"><code style="box-sizing:border-box;font-family:SFMono-Regular,Consolas,&quot;Liberation Mono&quot;,Menlo,Courier,monospace;font-size:13.6px;padding:0px;margin:0px;background-color:transparent;border-radius:3px;word-break:normal;white-space:pre-wrap;border:0px;display:inline;overflow:visible;line-height:inherit;word-wrap:normal">$ $ERL_TOP/bin/cerl -debug</code></pre>
                            </blockquote>
                            <br>
                            I realise these are more rebar3 than erlang
                            questions, but I can't find much in the
                            rebar3 docs about them:<br>
                            <br>
                            - How should I specify that rebar3 should
                            run "cerl" instead of "erl" ?<br>
                            <br>
                            - Should I just add "-debug" in my
                            "config/vm.args" or is there another way to
                            do this?<br>
                            <br>
                            Thank you for your help!<span
                              class="m_-8160118314680844088gmail-HOEnZb"><font
                                color="#888888"><br>
                                i</font></span>
                            <div>
                              <div
                                class="m_-8160118314680844088gmail-h5"><br>
                                <br>
                                <div
                                  class="m_-8160118314680844088gmail-m_-599898042086351337moz-cite-prefix">On
                                  29/05/2018 11:30, Lukas Larsson wrote:<br>
                                </div>
                                <blockquote type="cite">
                                  <div dir="ltr">Have you tried to run
                                    your code in a debug emulator? <a
href="https://github.com/erlang/otp/blob/master/HOWTO/INSTALL.md#how-to-build-a-debug-enabled-erlang-runtime-system"
                                      target="_blank"
                                      moz-do-not-send="true">https://github.com/erlang/otp/blob/master/HOWTO/INSTALL.md#how-to-build-a-debug-enabled-erlang-runtime-system</a>
                                    <div><br>
                                    </div>
                                    <div>Since it seems to be
                                      segfaulting in lists:member/2, I
                                      would guess that your NIF somehow
                                      builds an invalid list that is
                                      later used by lists:member/2.</div>
                                  </div>
                                  <div class="gmail_extra"><br>
                                    <div class="gmail_quote">On Tue, May
                                      29, 2018 at 11:04 AM, Igor Clark <span
                                        dir="ltr">&lt;<a
                                          href="mailto:igor.clark@gmail.com"
                                          target="_blank"
                                          moz-do-not-send="true">igor.clark@gmail.com</a>&gt;</span>
                                      wrote:<br>
                                      <blockquote class="gmail_quote"
                                        style="margin:0px 0px 0px
                                        0.8ex;border-left:1px solid
                                        rgb(204,204,204);padding-left:1ex">Thanks
                                        Sergej - that's where I got the
                                        thread reports I pasted in
                                        below, from e.g.
                                        'beam.smp_2018-05-28-212735_Igor-Clarks-iMac.crash'.<br>
                                        <br>
                                        Each log says the only crashed
                                        thread was a scheduler thread,
                                        for example "8_scheduler"
                                        running "process_main" in the
                                        case of the first one below.
                                        This is how I tracked down a
                                        bunch of errors in my own code,
                                        but the only ones that still
                                        happen are in the scheduler,
                                        according to the Console crash
                                        logs.<br>
                                        <br>
                                        The thing is, it seems really
                                        unlikely that a VM running my
                                        NIF code would just happen to be
                                        crashing in the scheduler rather
                                        than my code(!) - so that's what
                                        I'm trying to work out, how to
                                        find out what's actually going
                                        on, given that the log tells me
                                        the crashed thread is running
                                        "process_main" or
                                        'lists_member_2'.<br>
                                        <br>
                                        Any suggestions welcome!<br>
                                        <br>
                                        Cheers,<br>
                                        Igor
                                        <div
                                          class="m_-8160118314680844088gmail-m_-599898042086351337HOEnZb">
                                          <div
                                            class="m_-8160118314680844088gmail-m_-599898042086351337h5"><br>
                                            <br>
                                            On 29/05/2018 04:16, Sergej
                                            Jurečko wrote:<br>
                                            <blockquote
                                              class="gmail_quote"
                                              style="margin:0px 0px 0px
                                              0.8ex;border-left:1px
                                              solid
                                              rgb(204,204,204);padding-left:1ex">
                                              On macOS there is a quick
                                              way to get a stack trace
                                              if you compiled with debug
                                              symbols.<br>
                                              Open
                                              /Applications/Utilities/Console<br>
                                              Go to: User Reports<br>
                                              <br>
                                              You will see beam.smp in
                                              there if it crashed. Click
                                              on it and you get a report
                                              of what every thread was
                                              calling at the time of
                                              the crash.<br>
                                              <br>
                                              <br>
                                              Regards,<br>
                                              Sergej<br>
                                              <br>
                                              <blockquote
                                                class="gmail_quote"
                                                style="margin:0px 0px
                                                0px
                                                0.8ex;border-left:1px
                                                solid
                                                rgb(204,204,204);padding-left:1ex">
                                                On 28 May 2018, at
                                                23:46, Igor Clark &lt;<a
href="mailto:igor.clark@gmail.com" target="_blank"
                                                  moz-do-not-send="true">igor.clark@gmail.com</a>&gt;
                                                wrote:<br>
                                                <br>
                                                Hi folks, hope all well,<br>
                                                <br>
                                                I have a NIF which very
                                                occasionally segfaults,
                                                intermittently and
                                                apparently
                                                unpredictably, bringing
                                                down the VM. I've spent
                                                a bunch of time tracing
                                                allocation and
                                                dereferencing problems
                                                in my NIF code, and I've
                                                got rid of what seems
                                                like 99%+ of the
                                                problems - but it still
                                                occasionally happens,
                                                and I'm having trouble
                                                tracing further, because
                                                the crash logs show the
                                                crashed threads as doing
                                                things like these: (each
                                                one taken from a
                                                separate log where it's
                                                the only crashed thread)<br>
                                                <br>
                                                <br>
                                                <blockquote
                                                  class="gmail_quote"
                                                  style="margin:0px 0px
                                                  0px
                                                  0.8ex;border-left:1px
                                                  solid
                                                  rgb(204,204,204);padding-left:1ex">
                                                  Thread 40 Crashed::
                                                  8_scheduler<br>
                                                  0   beam.smp         
                                                                 
                                                  0x000000001c19980b
                                                  process_main + 1570<br>
                                                  <br>
                                                  Thread 5 Crashed::
                                                  3_scheduler<br>
                                                  0   beam.smp         
                                                                 
                                                  0x000000001c01d80b
                                                  process_main + 1570<br>
                                                  <br>
                                                  Thread 7 Crashed::
                                                  5_scheduler<br>
                                                  0   beam.smp         
                                                                 
                                                  0x000000001baff0b8
                                                  lists_member_2 + 63<br>
                                                  <br>
                                                  Thread 3 Crashed::
                                                  1_scheduler<br>
                                                  0   beam.smp         
                                                                 
                                                  0x000000001d4b780b
                                                  process_main + 1570<br>
                                                  <br>
                                                  Thread 5 Crashed::
                                                  3_scheduler<br>
                                                  0   beam.smp         
                                                                 
                                                  0x000000001fcf280b
                                                  process_main + 1570<br>
                                                  <br>
                                                  Thread 6 Crashed::
                                                  4_scheduler<br>
                                                  0   beam.smp         
                                                                 
                                                  0x000000001ae290b8
                                                  lists_member_2 + 63<br>
                                                </blockquote>
                                                <br>
                                                I'm very confident that
                                                the problems are in my
                                                code, not in the
                                                scheduler ;-) But
                                                without more detail, I
                                                don't know how to trace
                                                where they're happening.
                                                When they do, there are
                                                sometimes other threads
                                                doing things in my code
                                                (maybe 20% of the time)
                                                - but mostly not, and on
                                                the occasions when they
                                                are, I've been unable to
                                                see what the problem
                                                might be on the lines
                                                referenced.<br>
                                                <br>
                                                It seems like it's some
                                                kind of cross-thread
                                                data access issue, but I
                                                don't know how to track
                                                it down.<br>
                                                <br>
                                                Some more context about
                                                what's going on. My NIF
                                                load() function starts a
                                                thread which passes a
                                                callback function to a
                                                library that talks to
                                                some hardware, which
                                                calls the callback when
                                                it has a message. It's a
                                                separate thread because
                                                the library only calls
                                                back to the thread that
                                                initialized it; when I
                                                ran it directly in NIF
                                                load(), it didn't call
                                                back, but in the
                                                VM-managed thread, it
                                                works as expected. The
                                                thread sits and waits
                                                for stuff to happen, and
                                                callbacks come when they
                                                should.<br>
                                                <br>
                                                I use
                                                enif_thread_create/enif_thread_opts_create
                                                to start the thread, and
                                                use enif_alloc/enif_free
                                                everywhere. I keep a
                                                static pointer in the
                                                NIF to a couple of
                                                members of the state
                                                struct, as that seems
                                                the only way to
                                                reference them in the
                                                callback function. The
                                                struct is kept in NIF
                                                private data: I pass
                                                **priv from load() to
                                                the thread_main
                                                function, allocate the
                                                state struct using
                                                enif_alloc in
                                                thread_main, and set
                                                priv pointing to the
                                                state struct, also in
                                                the thread. Other NIF
                                                functions do access
                                                members of the state
                                                struct, but only ever
                                                through enif_priv_data(
                                                env ).<br>
                                                <br>
                                                The vast majority of the
                                                time it all works
                                                perfectly, humming along
                                                very nicely, but every
                                                now and then, without
                                                any real pattern I can
                                                see, it just segfaults
                                                and the VM comes down.
                                                It's only happened 3
                                                times in the last 20+
                                                hours of working on the
                                                app, testing &amp;
                                                running all the while,
                                                doing VM starts, stops,
                                                code reloads, etc. But
                                                when it happens, it's
                                                kind of a showstopper,
                                                and I'd really like to
                                                nail it down.<br>
                                                <br>
                                                This is all happening in
                                                Erlang 20.3.4 on macOS
                                                10.12.6 / Apple LLVM
                                                version 9.0.0
                                                (clang-900.0.38).<br>
                                                <br>
                                                Any ideas on how/where
                                                to look next to try to
                                                track this down? Hope
                                                it's not something
                                                structural in the above
                                                which just won't work.<br>
                                                <br>
                                                Cheers,<br>
                                                Igor<br>
                                                <br>
                                                <br>
______________________________<wbr>_________________<br>
                                                erlang-questions mailing
                                                list<br>
                                                <a
                                                  href="mailto:erlang-questions@erlang.org"
                                                  target="_blank"
                                                  moz-do-not-send="true">erlang-questions@erlang.org</a><br>
                                                <a
                                                  href="http://erlang.org/mailman/listinfo/erlang-questions"
                                                  rel="noreferrer"
                                                  target="_blank"
                                                  moz-do-not-send="true">http://erlang.org/mailman/list<wbr>info/erlang-questions</a><br>
                                              </blockquote>
                                            </blockquote>
                                            <br>
                                          </div>
                                        </div>
                                      </blockquote>
                                    </div>
                                    <br>
                                  </div>
                                </blockquote>
                                <br>
                              </div>
                            </div>
                          </div>
                          <br>
                          <br>
                        </blockquote>
                      </div>
                      <br>
                    </div>
                  </div>
                </div>
              </div>
            </div>
            <br>
            <br>
          </blockquote>
        </div>
        <br>
      </div>
    </blockquote>
    <br>
  </body>
</html>