[erlang-questions] Process heap inspector

Michal Ptaszek michal.ptaszek@REDACTED
Mon Nov 28 08:39:53 CET 2011


Hi everyone, 

This idea came to me while debugging a complex live system
and trying to figure out where all my memory had gone.

So, when debugging a live system, investigating suspicious memory consumption patterns,
or simply trying to better understand what's going on with our processes, it might be useful
to take a peek at the data a given process operates on.

Right now it is possible to fetch the internal state of gen_* processes via sys:get_status, we can do
some tracing (even using DTrace), and we can check the erlang:process_info output to get a rough
idea of the heap size of our suspect. Still, not all processes are OTP-compatible, and even if
they are, we only get the "live" data that is part of the process state (not counting outdated,
not yet garbage-collected terms). Also, process_info only informs us about the allocated size of
the heap, not about the actual usage (the pre-allocated chunks are not available to the rest of
the system anyway, but the usage would show how far we are from growing or shrinking the heap).
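
For reference, here is a minimal sketch of what the existing tools give us. The module name
heap_peek and its function names are made up for illustration; only erlang:process_info/2
and sys:get_status/2 are stock OTP calls. It assumes a local pid (process_info raises badarg
for remote ones).

```erlang
-module(heap_peek).
-export([summary/1, otp_status/1]).

%% What process_info already exposes for a local process.
%% heap_size and total_heap_size are reported in words, memory in bytes.
summary(Pid) when is_pid(Pid) ->
    Items = [heap_size, total_heap_size, memory, message_queue_len],
    case erlang:process_info(Pid, Items) of
        undefined -> {error, not_alive};
        Info      -> {ok, Info}
    end.

%% sys:get_status/2 only works for processes that handle system messages
%% (gen_* and friends); for anything else the call exits after the timeout.
otp_status(Pid) when is_pid(Pid) ->
    try
        {ok, sys:get_status(Pid, 1000)}
    catch
        exit:Reason -> {error, Reason}
    end.
```

Neither call tells us which terms occupy the heap, which is the gap described above.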

Enough introduction, let's focus on the actual meat: my idea is to add a new BIF,
erlang:inspect_heap(Pid), which allows us to take a look at any process's heap, fetch the
terms residing there, and check their actual size. For instance:

> (ejabberd@REDACTED)12> S = erlang:inspect_heap(pid(0, 358, 0)).
> [{[[<<"5">>]|
>    284735200226724471091958640173737944785062822211005333957298336375301959844499896296764925551414319236776784],
>   20},
>  {{'$internal_queue_len',0},3},
>  {{random_seed,{8236,26623,17360}},7},
>  {{'$ancestors',[ejabberd_c2s_sup,ejabberd_sup,<0.40.0>]},9},
>  {{'$initial_call',{gen,init_it,6}},7},
>  {{state,{socket_state,tls,
>                        {tlssock,#Port<0.3936>,#Port<0.3938>},
>                        <0.357.0>},
>          ejabberd_socket,#Ref<0.0.0.10120>,false,<<"2855118401">>,
>          {sasl_state,"jabber",<<"pvp.net">>,[],
>                      #Fun<ejabberd_c2s.0.67315917>,#Fun<ejabberd_c2s.1.67315917>,
>                      #Fun<ejabberd_c2s.2.67315917>,cyrsasl_digest,
>                      {state,5,<<"3598825873">>,
>                             {<<"dupa">>,<<...>>},
>                             <<>>,#Fun<ejabberd_c2s.0.67315917>,...}},
>          true,
>          {jid,<<"dupa">>,<<"pvp.net">>,<<"hubbard">>,<<"dupa">>,
>               <<"pvp.net">>,<<"hubbard">>},
>          <<"Nicknamedupa">>,
>          {{1322,217197,749816},<0.358.0>},
>          {1,{<<"dupa">>,nil,nil}},
>          {1,{<<"dupa">>,nil,nil}},
>          {1,{<<"dupa">>,nil,nil}},
>          {xmlelement,<<"presence">>,[],
>                      [{xmlcdata,<<...>>},{xmlelement,...},{...}|...]},
>          {userlist,none,[],false}},
>   564},
>  {{limits,undefined},3},
>  {{[],[]},3}]

gives us a pretty good picture of <0.358.0>:
• '$_' ('$internal_queue_len', '$ancestors', '$initial_call') - OTP + gen_fsm2 process dictionary entries, 3, 9 and 7 words respectively
• random_seed - obvious
• {{limits,undefined},3} - internal limits for gen_fsm2 message queue, 3 words
• {[], []} - most probably leftovers from fetching the user's presence lists
• {state, _} - gen_fsm2 state record - 564 words
• {[[<<"5">>]|...], ...} - sequential tracing tokens? (I'm not very familiar with those; I would
say it's something from our rootset.)
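
Since the result is a plain list of {Term, SizeInWords} pairs, it is easy to post-process.
A small sketch (the heap_report module and its functions are hypothetical helpers, not part
of the branch) that picks the largest entries and converts the total to bytes:

```erlang
-module(heap_report).
-export([largest/2, total_bytes/1]).

%% Entries is a list of {Term, SizeInWords} pairs, the shape shown above.
%% Returns the N largest entries, biggest first.
largest(Entries, N) when is_integer(N), N >= 0 ->
    Sorted = lists:reverse(lists:keysort(2, Entries)),
    lists:sublist(Sorted, N).

%% Sum of all entry sizes, converted from words to bytes
%% using the word size of the running emulator.
total_bytes(Entries) ->
    Words = lists:sum([W || {_Term, W} <- Entries]),
    Words * erlang:system_info(wordsize).
```

For the output above, heap_report:largest(S, 1) would return the 564-word state record.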

The implementation is rather simple: if the process we probe is not the calling one (i.e. we are not doing
erlang:inspect_heap(self())), the data is copied from the probed process's heap to the caller's heap (to avoid
cross-process references in variables), then we compute the flat size of each term we copied. The rootset
(i.e. process dict, seq tokens, etc.) is also included in the summary.
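
To illustrate the "copy, then measure flat size" step without the BIF, here is a rough
approximation in plain Erlang. It is not the code from the branch: it only covers the process
dictionary part of the rootset, and the dict_sizes module name is made up. It relies on
erlang:process_info/2, which already copies the entries onto the caller's heap, and on
erts_debug:flat_size/1, an internal debugging function that returns the size in words.

```erlang
-module(dict_sizes).
-export([of_pid/1]).

%% Approximation only: lists the process dictionary entries of a local
%% process together with their flat size in words. Terms that live on the
%% heap but are not reachable from the dictionary are not covered.
of_pid(Pid) when is_pid(Pid) ->
    case erlang:process_info(Pid, dictionary) of
        undefined ->
            [];
        {dictionary, Dict} ->
            %% Each Entry is a {Key, Value} tuple already copied to our heap.
            [{Entry, erts_debug:flat_size(Entry)} || Entry <- Dict]
    end.
```

A two-element tuple of immediates such as {'$internal_queue_len',0} comes out as 3 words
here as well, which agrees with the sample output.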

The code is in the inspect_heap branch of my OTP fork on GitHub:
 https://github.com/paulgray/otp/tree/inspect_heap

I am still a little hesitant about suspending the process we probe: can anyone tell
me whether acquiring the main process lock would be enough to keep its heap untouched during
the call?

Please do point out any bugs and tell me what you think about the idea.

Best regards,
Michal Ptaszek

