[erlang-questions] Huge erl_crash.dump (2 gigs) - looking for advice

David Welton davidnwelton@REDACTED
Thu Jul 3 15:22:34 CEST 2014


The kind people in #erlang have given me some suggestions, but I'm
writing here to appeal to a wider audience.  I've got a huge
erl_crash.dump, larger than 2 gigs, and I'm trying to figure out
anything I can about the crash, which came out of the blue.

Slogan: eheap_alloc: Cannot allocate 1080371408 bytes of memory (of type "heap_frag").
System version: Erlang R16B03-1 (erts-5.10.4) [source] [64-bit] [smp:4:4] [async-threads:10] [kernel-poll:true]
Compiled: Sun Mar 16 05:25:57 2014
Taints: crypto

Digging further, I found this:

=proc:<0.6.0>
State: Running
Name: error_logger
Spawned as: proc_lib:init_p/5
Last scheduled in for: gen_event:handle_msg/5
Spawned by: <0.2.0>
Started: Wed Jul  2 11:45:41 2014
Message queue length: 1
Number of heap fragments: 0
Heap fragment data: 0
Link list: [<0.0.0>, <0.98.0>, <0.32.0>]
Reductions: 19480050
Stack+heap: 137319567
OldHeap: 28690
Heap unused: 2273119
OldHeap unused: 2581
Memory: 1098789880
Program counter: 0x00007f1e51f3ed70 (gen_event:handle_msg/5 + 8)
CP: 0x0000000000000000 (invalid)

So that's what actually blew things up.  But how did it get all that memory?
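
A rough sanity check on those numbers, as a sketch only: it assumes
that Stack+heap, Heap unused and OldHeap are counted in machine words
(8 bytes each on this 64-bit emulator), while Memory and the slogan
are in bytes.  The variable names are made up for illustration.

```python
WORD_BYTES = 8  # assumption: 64-bit emulator, one heap word = 8 bytes

# Figures copied from the slogan and the =proc:<0.6.0> section above.
failed_alloc_bytes = 1080371408
stack_heap_words = 137319567
heap_unused_words = 2273119
old_heap_words = 28690
memory_bytes = 1098789880

heap_bytes = stack_heap_words * WORD_BYTES
used_bytes = (stack_heap_words - heap_unused_words) * WORD_BYTES
old_heap_bytes = old_heap_words * WORD_BYTES
unaccounted_bytes = memory_bytes - heap_bytes - old_heap_bytes

print("stack+heap bytes      :", heap_bytes)         # 1098556536
print("used heap bytes       :", used_bytes)         # 1080371584
print("old heap bytes        :", old_heap_bytes)     # 229520
print("memory not on heaps   :", unaccounted_bytes)  # 3824
print("used heap - failed req:", used_bytes - failed_alloc_bytes)  # 176
```

Under that assumption the young heap alone accounts for nearly all of
Memory, and the used part of it is within 176 bytes of the allocation
that failed.  That would be consistent with the emulator trying to
copy roughly the whole heap's worth of data in one go, for instance
while forwarding the message to another process, but this is a guess
from the arithmetic, not something the dump states.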

=proc_dictionary:<0.6.0>
H7F1E4D062F68
H7F1E4D062F80
=proc_stack:<0.6.0>
0x00007f1dccc223a0:SReturn addr 0x5204B398 (gen_server:do_cast/2 + 128)
y0:H7F1DCBACA990
y1:AF:lager_crash_log
y2:SCatch 0x5204D188 (gen_server:do_send/2 + 112)
0x00007f1dccc223c0:SReturn addr 0x4F2936C8 (error_logger_lager_h:log_event/2 + 10328)
0x00007f1dccc223c8:SReturn addr 0x51F422F0 (gen_event:server_update/4 + 272)
y0:N
y1:N
y2:N
y3:N
y4:A8:emulator
y5:H7F1DCBACA8C8
y6:H7F1DCBACA888
y7:H7F1DCBACA930
0x00007f1dccc22410:SReturn addr 0x51F41ED0 (gen_event:server_notify/4 + 136)
y0:AC:error_logger
y1:H7F1DCBACA8F8
y2:H7F1D8B478038
y3:H7F1D8B478068
y4:A14:error_logger_lager_h
y5:SCatch 0x51F422F0 (gen_event:server_update/4 + 272)
0x00007f1dccc22448:SReturn addr 0x51F3EE68 (gen_event:handle_msg/5 + 256)
y0:AC:error_logger
y1:AC:handle_event
y2:H7F1DCBACA8F8
y3:N
0x00007f1dccc22470:SReturn addr 0x51F359B0 (proc_lib:init_p_do_apply/3 + 56)
y0:N
y1:AC:error_logger
y2:P<0.2.0>
0x00007f1dccc22490:SReturn addr 0x842688 (<terminate process normally>)
y0:SCatch 0x51F359D0 (proc_lib:init_p_do_apply/3 + 88)

The stack trace seems to indicate that it's trying to log something;
perhaps someone sent it a very large message?  But I wonder where it
came from in the first place...

I tried using crashdump_viewer, but it chokes when I click on the
process and it tries to load up the enormous =proc_heap section:

=proc_heap:<0.6.0>
7F1DCBACA990:t2:A9:$gen_cast,H7F1DCBACA978
7F1DCBACA978:t2:A3:log,H7F1DCBACA8F8
7F1DCBACA8F8:t3:A5:error,A6:noproc,H7F1DCBACA8D8
7F1DCBACA8D8:t3:A8:emulator,H7F1DCBACA8C8,H7F1DCBACA888
7F1DCBACA888:lH7F1DCBACA878|N
7F1DCBACA878:lI39|H7F1DCBACA868
7F1DCBACA868:lI103|H7F1DCBACA858
7F1DCBACA858:lI115|H7F1DCBACA848
7F1DCBACA848:lI100|H7F1DCBACA838
7F1DCBACA838:lI95|H7F1DCBACA828
7F1DCBACA828:lI119|H7F1DCBACA818
7F1DCBACA818:lI101|H7F1DCBACA808
7F1DCBACA808:lI98|H7F1DCBACA7F8
7F1DCBACA7F8:lI64|H7F1DCBACA7E8
7F1DCBACA7E8:lI108|H7F1DCBACA7D8
7F1DCBACA7D8:lI111|H7F1DCBACA7C8
7F1DCBACA7C8:lI99|H7F1DCBACA7B8
7F1DCBACA7B8:lI97|H7F1DCBACA7A8
... and on and on for thousands of lines ...

davidw@REDACTED:~$ grep -n '^=proc_heap' erl_crash.dump
15835:=proc_heap:<0.0.0>
16133:=proc_heap:<0.3.0>
17424:=proc_heap:<0.6.0>
67540816:=proc_heap:<0.7.0>
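
The gap between consecutive headers gives an upper bound on how many
lines each section takes.  A small sketch of that arithmetic follows;
section_gaps is a hypothetical helper, and the bound is loose because
other sections (the next process's =proc_dictionary and =proc_stack,
for example) may sit between two =proc_heap headers.

```python
from typing import List, Optional, Tuple


def section_gaps(grep_lines: List[str]) -> List[Tuple[str, int, Optional[int]]]:
    """Parse `grep -n` output such as '17424:=proc_heap:<0.6.0>'.

    Returns (header, line_number, lines_until_next_header) per match;
    the last entry has None because its end is not known from grep alone.
    """
    parsed = []
    for raw in grep_lines:
        raw = raw.strip()
        if not raw:
            continue
        # Split only on the first colon: the header itself contains colons.
        line_no, header = raw.split(":", 1)
        parsed.append((header, int(line_no)))

    result = []
    for i, (header, line_no) in enumerate(parsed):
        gap = parsed[i + 1][1] - line_no if i + 1 < len(parsed) else None
        result.append((header, line_no, gap))
    return result


GREP_OUTPUT = """\
15835:=proc_heap:<0.0.0>
16133:=proc_heap:<0.3.0>
17424:=proc_heap:<0.6.0>
67540816:=proc_heap:<0.7.0>
"""

gaps = section_gaps(GREP_OUTPUT.splitlines())
for header, line_no, gap in gaps:
    print(header, line_no, gap)
```

For <0.6.0> this gives 67540816 - 17424 = 67523392 lines, against 298
and 1291 for the two sections before it.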


Incidentally, what *are* all those lines like

7F1DCBACA7F8:lI64|H7F1DCBACA7E8

anyway?

Is there any way to hack something up that will process those 67
million lines to tell me something useful about what's going on?
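
One possible starting point, as a sketch under assumptions read off
the excerpt above rather than from any documentation: each such line
looks like one list (cons) cell, written as ADDRESS:l HEAD | TAIL,
where a head like I64 is the integer 64, a tail like H7F1DCBACA7E8
points at the next cell, and N is the empty list.  If the integers
are character codes, a chain of these cells is a string.  decode_runs
below is a hypothetical helper that streams over the lines, follows
chains where each tail points at the address on the next line, and
reports each chain's length plus a short preview, so memory use stays
small even for tens of millions of lines.

```python
import re
from typing import Iterable, Iterator, Tuple

# One cons cell with an integer head, e.g. "7F1DCBACA878:lI39|H7F1DCBACA868".
CONS_INT = re.compile(r"^([0-9A-Fa-f]+):lI(\d+)\|(N|H[0-9A-Fa-f]+)$")


def decode_runs(lines: Iterable[str], min_len: int = 4,
                preview: int = 200) -> Iterator[Tuple[int, str]]:
    """Yield (length, first `preview` characters) for each chain of
    integer-headed cons cells whose tail points at the following line."""
    count, chars, expected = 0, [], None
    for line in lines:
        m = CONS_INT.match(line.strip())
        code = int(m.group(2)) if m else -1
        # Only treat the cell as a character if it is a valid code point.
        is_char_cell = m is not None and code < 0x110000
        continues = (is_char_cell and count > 0
                     and m.group(1).upper() == expected)
        if count > 0 and not continues:
            # The current chain ended on the previous line: report it.
            if count >= min_len:
                yield count, "".join(chars)
            count, chars, expected = 0, [], None
        if not is_char_cell:
            continue
        count += 1
        if len(chars) < preview:
            chars.append(chr(code))
        tail = m.group(3)
        # "N" ends the list; otherwise remember where the tail points.
        expected = None if tail == "N" else tail[1:].upper()
    if count >= min_len:
        yield count, "".join(chars)


# Usage on the real file (path is an example):
#   with open("erl_crash.dump", errors="replace") as dump:
#       for length, text in decode_runs(dump):
#           print(length, repr(text))
```

Run over the thirteen integer cells quoted above, this decodes the
character codes 39, 103, 115, 100, 95, 119, 101, 98, 64, 108, 111,
99, 97 as the text 'gsd_web@loca (with a leading quote), which looks
like the start of a quoted node name.  It only follows chains laid
out on consecutive lines, so lists whose cells are scattered around
the heap would need a real address lookup instead.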

Other ideas about how to extract something meaningful about who
plopped this massive message in the logger?

Thank you
-- 
David N. Welton

http://www.welton.it/davidw/

http://www.dedasys.com/


