[erlang-questions] Huge erl_crash.dump (2 gigs) - looking for advice

Jan Chochol jan.chochol@REDACTED
Thu Jul 3 16:34:47 CEST 2014


Hi David,

I think this is quite a common pattern: some process wants to log a huge
term (large, but not fatally so), and the error logger then crashes because
it tries to convert all of the data to strings, which can consume much more
memory than the original term.
It has happened to us several times.

The proc_heap section contains a dump of the process heap. I am not exactly
sure about the structure, but I think each line has the form
<memory address>:<type><type data>.
Part of your dump:

7F1DCBACA878:lI39|H7F1DCBACA868
7F1DCBACA868:lI103|H7F1DCBACA858
7F1DCBACA858:lI115|H7F1DCBACA848
7F1DCBACA848:lI100|H7F1DCBACA838
7F1DCBACA838:lI95|H7F1DCBACA828
7F1DCBACA828:lI119|H7F1DCBACA818
7F1DCBACA818:lI101|H7F1DCBACA808
7F1DCBACA808:lI98|H7F1DCBACA7F8
7F1DCBACA7F8:lI64|H7F1DCBACA7E8
7F1DCBACA7E8:lI108|H7F1DCBACA7D8
7F1DCBACA7D8:lI111|H7F1DCBACA7C8
7F1DCBACA7C8:lI99|H7F1DCBACA7B8
7F1DCBACA7B8:lI97|H7F1DCBACA7A8

This is part of a list. Type "l" is a cons cell (the basic building block of
lists), which holds a pair <value>|<next item>. The value here is of type
"I" (a small integer, in this case a decimal character code), and the next
item is of type "H", a pointer to the next heap cell.
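To make that reading of a line concrete, here is a minimal Python sketch
that splits one cons-cell line into its parts. It only assumes the layout
guessed above (<address>:l<head>|<tail>, each side being a one-letter type
tag followed by its data); the name parse_cons and the dictionary layout are
made up for illustration and are not part of any Erlang tool.

```python
import re

# Layout assumed from the excerpt: <hex address>:l<head>|<tail>
# where head and tail are each a one-letter type tag plus its data.
CONS_RE = re.compile(r"^([0-9A-Fa-f]+):l([A-Z])([^|]*)\|([A-Z])(.*)$")


def parse_cons(line):
    """Split one cons-cell line into address, head and tail.

    Returns None for lines that do not look like a cons cell.
    """
    m = CONS_RE.match(line.strip())
    if m is None:
        return None
    addr, head_tag, head_data, tail_tag, tail_data = m.groups()
    return {
        "addr": int(addr, 16),          # address of this cell
        "head": (head_tag, head_data),  # e.g. ("I", "39")
        "tail": (tail_tag, tail_data),  # e.g. ("H", "7F1DCBACA868")
    }


if __name__ == "__main__":
    cell = parse_cons("7F1DCBACA878:lI39|H7F1DCBACA868")
    print(hex(cell["addr"]), cell["head"], cell["tail"])
    # The head is a character code: 39 is the single quote character.
    print(repr(chr(int(cell["head"][1]))))
```

In the excerpt each tail pointer is exactly 0x10 below the address of its
own cell, which is why reading the lines in file order happens to work.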
The simplest approach is to assume that the data forms a list in the order
of the memory cells (which is quite often the case). You can, for example,
use this simple Perl one-liner to get an idea of the data (that is, of what
the error logger wants to display):

$ perl -ne 'print chr($1) if /:lI([0-9]+)\|/' < part_of_crash_dump
'gsd_web@REDACTED

Here the file "part_of_crash_dump" contains the data from the proc_heap
section. I am sure there are more sophisticated tools for analyzing crash
dumps, but this is good enough for a first look.
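If the cells are not in list order in the file, the one-liner prints the
characters scrambled. A slightly more careful variant follows the "H"
pointers instead. The Python sketch below is only an illustration under the
same guesses about the format: it looks only at cons cells whose head is a
decimal small integer, treats any tail that is not an "H" pointer as the end
of a chain, and the names load_cells and decode_strings are hypothetical.

```python
import re

# Cons cell with a small-integer head; the tail is either an "H" pointer
# (captured) or something else, which is treated as the end of the chain.
CELL_RE = re.compile(r"^([0-9A-Fa-f]+):lI([0-9]+)\|(?:H([0-9A-Fa-f]+))?")


def load_cells(lines):
    """Map cell address -> (character code, address of next cell or None)."""
    cells = {}
    for line in lines:
        m = CELL_RE.match(line.strip())
        if m is None:
            continue
        addr, code, nxt = m.groups()
        code = int(code)
        if code >= 0x110000:  # not a valid character code, skip the cell
            continue
        cells[int(addr, 16)] = (code, int(nxt, 16) if nxt else None)
    return cells


def decode_strings(cells):
    """Follow the tail pointers from every cell that nothing points to."""
    targets = {nxt for _, nxt in cells.values() if nxt is not None}
    strings = []
    for start in sorted(a for a in cells if a not in targets):
        chars, addr, seen = [], start, set()
        while addr in cells and addr not in seen:  # stop on gaps and cycles
            seen.add(addr)
            code, addr = cells[addr]
            chars.append(chr(code))
        strings.append("".join(chars))
    return strings


if __name__ == "__main__":
    # Three cells from the excerpt, deliberately out of order.
    sample = [
        "7F1DCBACA858:lI115|H7F1DCBACA848",
        "7F1DCBACA878:lI39|H7F1DCBACA868",
        "7F1DCBACA868:lI103|H7F1DCBACA858",
    ]
    print(decode_strings(load_cells(sample)))  # prints ["'gs"]
```

For a 2 GB dump it is probably better to cut out the proc_heap section of
the one suspicious process first, since the sketch keeps every matching
cell in memory.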

Regards,
Jan


On Thu, Jul 3, 2014 at 4:08 PM, David Welton <davidnwelton@REDACTED> wrote:

> >> The kind people in #erlang have given me some suggestions, but I'm
> >> going to write here to appeal to a wider audience.  I've got a huge
> >> erl_crash.dump, that's larger than 2 gigs
> >
> > Assuming for a second that this doesn't contain actual dumps of data it
> > feels like it might contain a long list of _somethings_ . Can you say (by
> > looking at a few random points) what something is ?
>
> I'm not sure I follow...  I don't know what the proc_heap section
> contains - I don't know what all the lines in it mean.
>
> Robert writes:
> > one thing to be aware of when using sasl and error_logger is that
> > process crashes get logged by sasl with the complete state of the process
> > that just died, and error_logger tries to pretty print that state; this can
> > take huge amounts of memory if the crashed process state is large.
>
> > I would recommend monitoring processes with high memory usage and
> > figuring out the usage patterns. In general it is a good idea to try and
> > limit the amount of state a process holds. But this is obviously
> > application dependent.
>
> I don't think any of the processes grew in size so much that it was
> anywhere near the kind of size needed to trigger an out of memory
> error were its state dumped and printed.  We'll keep an eye on them
> just in case one of them is stealthily growing, but for the moment we
> have not been able to reproduce the error.
>
> Thanks
> --
> David N. Welton
>
> http://www.welton.it/davidw/
>
> http://www.dedasys.com/