<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
Hey again - just closing the loop here. I had another, even less
frequent bug that was still triggering erts_exit(ERTS_ABORT_EXIT,
...), only happening once in a blue moon, so I fired up guard malloc
with the "+Mea min" option, and this time not only did it give an
immediate crash, but it also gave me the exact line number :-))<br>
<br>
I can't tell you how useful this is - I've had these intermittent
problems only showing up every now and then for a good while, with
no real way to track them down, and so they've been grumbling at the
back of my mind all that time. So the combination of the erl option
and libgmalloc is just an amazing tool for me to hunt down this kind
of issue. Thanks so much for telling me about it!<br>
<br>
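For anyone who finds this in the archives later, here's a minimal
sketch of why Scott's point about the BEAM allocator matters, and why
"+Mea min" helps. It's plain C, not ERTS code: the pool, carve() and
overflow_hits_neighbour() are hypothetical names I made up to stand
in for "malloc one big block, then hand out suballocations".<br>
<br>
```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a suballocating allocator: one malloc'd
 * block ("pool") carved into fixed-size pieces. Not ERTS code. */
enum { PIECE = 16, PIECES = 4 };

/* Return a pointer to piece i of the pool. */
static unsigned char *carve(unsigned char *pool, int i) {
    return pool + (size_t)i * PIECE;
}

/* Returns 1 if writing one byte past the end of piece 0 silently
 * changed piece 1, 0 if it did not, -1 if malloc failed. */
int overflow_hits_neighbour(void) {
    unsigned char *pool = malloc(PIECE * PIECES);
    if (pool == NULL) return -1;
    memset(pool, 0, PIECE * PIECES);

    unsigned char *a = carve(pool, 0);
    unsigned char *b = carve(pool, 1);
    assert(b == a + PIECE); /* the two pieces sit back to back */

    /* Off-by-one bug: index PIECE is one past the end of `a`, but it
     * is still inside the malloc'd pool, so nothing traps here. */
    a[PIECE] = 0xAB;

    int corrupted = (b[0] == 0xAB);
    free(pool);
    return corrupted;
}
```
<br>
Here the stray write stays inside the one big malloc'd block, so it
just lands in the neighbouring piece and nothing crashes at that
point. If each piece came from its own malloc() under guard malloc,
I'd expect the same write to fault right at the offending line, which
matches what I saw once "+Mea min" was on.<br>
<br>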
Very best,<br>
Igor<br>
<br>
<div class="moz-cite-prefix">On 30/05/2018 10:19, Igor Clark wrote:<br>
</div>
<blockquote type="cite"
cite="mid:ab760066-3deb-b103-514b-e50737730baf@gmail.com">
Thanks Dominic - I don't want to count my chickens before they've
hatched, but it looks like guard malloc has pointed me to at least
some bugs even without that VM option. Even though I wasn't
getting a line number in the stack trace, it was already seeming
to make the NIF crash immediately and consistently, so I was able
to use a ton of debug print statements to track down two problems
that I hadn't been able to see before. (One was an enif_alloc() in
the wrong place, and another seems to have been accessing a
pointer from a function in a shared object file, oops.) No way
would I have seen them without guard malloc showing me the way,
it's a powerful tool :-)<br>
<br>
So I fixed those two, and right now the app is running as expected
without crashes under guard malloc. I'm pretty sure that I'll come
up against more illegal-access bugs over time, so I'm adding "+Mea
min" to the list of options to use when I find the next one. Thank
you.<br>
<br>
Thanks very much also to everyone who replied, particularly Scott
for the guard malloc suggestion &amp; help, and Fred &amp; Tristan
for the rebar3 tips so I could add the necessary CLI options and
track down what was going on. I'm very glad to have been able to
ask such experienced folks for advice, and to have learned about
some *extremely* useful new stuff.<br>
<br>
Cheers,<br>
Igor<br>
<br>
<br>
<div class="moz-cite-prefix">On 29/05/2018 23:58, Dominic Morneau
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CALvPfko7GH2Fwn8DewCGiZf8oiQdOF5ynheQx7r+=GvMSMJNRQ@mail.gmail.com">
<div>
<div>
<div dir="auto">Can you give it a try with <span style="white-space:pre-wrap;background-color:rgb(255,255,255)">"+Mea min" in erl options? This should make Erlang fall back to malloc for all allocators, hopefully making guard malloc more effective.</span></div>
<div dir="auto"><br>
</div>
<div dir="auto">Dominic</div>
<div><br>
<div class="gmail_quote">
<div>On Wed, 30 May 2018 at 5:15, Igor Clark &lt;<a
href="mailto:igor.clark@gmail.com" target="_blank"
moz-do-not-send="true">igor.clark@gmail.com</a>&gt; wrote:<br>
</div>
</div>
</div>
</div>
</div>
<div>
<div>
<div>
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">OK.
Thanks very much Scott. I've got all this working
using both those <br>
extra options, and it does seem to make the NIF crash
a lot sooner than <br>
previously, which is great. But I'm still only seeing
"process_main" in <br>
the crashed thread, so I'm not much closer to knowing
where the illegal <br>
access is. I wonder if it's in lots of places because
of what I'm doing <br>
with the callback and the thread. I hope not.<br>
<br>
I'll do some more digging, and tomorrow I'll try out a
debug emulator <br>
build as well.<br>
<br>
Thanks very much for helping me get this far!<br>
<br>
On 29/05/2018 16:31, Scott Ribe wrote:<br>
>> On May 29, 2018, at 9:16 AM, Igor Clark &lt;<a
href="mailto:igor.clark@gmail.com" target="_blank"
moz-do-not-send="true">igor.clark@gmail.com</a>&gt;
wrote:<br>
>><br>
>> So, do I have this right: the point of the
Guard Malloc is to make the crash happen at the time
of allocation, rather than delayed until something
trying to access it triggers the segfault; so if I get
a crash while running like this, I should be able to
just check in the Console debug log, and the stack
trace should show where the bug actually is?<br>
> At the time of the illegal access, not the
allocation. Yes, that's the point, you get a stack
trace showing you illegal access.<br>
><br>
> However, the BEAM allocator will reduce its
effectiveness. When you malloc in your C code, you get
a block set up such that accessing just past it (or
potentially before it) will cause an immediate crash.
When you free it, it's then set up such that accessing
will cause an immediate crash. But if you use Erlang's
allocation routines, Erlang may malloc a bigger block
with those protections, then hand out multiple
suballocations, and access beyond the end of one of
those can simply corrupt the next one without crashing
at that point.<br>
><br>
> You should also be using MallocScribble &amp;
MallocPreScribble.<br>
><br>
><br>
><br>
<br>
</blockquote>
</div>
</div>
</div>
</div>
</blockquote>
<br>
</blockquote>
<br>
</body>
</html>