[erlang-questions] Process heap inspector

Michal Ptaszek michal.ptaszek@REDACTED
Mon Nov 28 12:02:32 CET 2011


Hey, 

good point, I knew I forgot about something - will add support for that
really soon. 

Kind regards,
Michal Ptaszek

On Nov 28, 2011, at 11:02 AM, Attila Rajmund Nohl wrote:

> Hello!
> 
> I really like the idea. But shouldn't this list include the message queue too?
> 
> 2011/11/28 Michal Ptaszek <michal.ptaszek@REDACTED>:
>> Hi everyone,
>> 
>> This idea came to me while debugging a complex, live system
>> and trying to figure out where all my memory had gone.
>> 
>> When debugging a live system, investigating suspicious memory consumption patterns,
>> or simply trying to understand better what is going on with our processes, it might be useful
>> to take a peek at the data a given process operates on.
>> 
>> Right now it is possible to fetch the internal state of gen_* processes via sys:get_status, we can do
>> some tracing (even using DTrace), and we can check the erlang:process_info output and analyze
>> it to get a rough idea of the heap size of our suspect. Still, not all processes
>> are OTP-compatible, and even when they are, we only get the "live" data reachable from the process state
>> (not counting the outdated, not yet garbage collected terms). Also, process_info informs us only
>> about the allocated size of the heap, not about the actual usage (the pre-allocated chunks are not
>> available to the rest of the system either way, but knowing the usage would show how far we are from growing/shrinking the heap).
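
For reference, a minimal sketch of what erlang:process_info/2 already reports today. The module name heap_overview and the derived total_heap_bytes entry are made up for illustration; the item names are the standard process_info ones. Note that it reports allocated sizes only, which is exactly the limitation described above.

```erlang
-module(heap_overview).
-export([overview/1]).

%% Returns what process_info/2 reports about the memory of a local Pid,
%% or 'undefined' if the process is no longer alive.
%% heap_size and total_heap_size are in words, memory is in bytes.
overview(Pid) when is_pid(Pid) ->
    Items = [heap_size, total_heap_size, memory, message_queue_len],
    case erlang:process_info(Pid, Items) of
        undefined ->
            undefined;
        Info ->
            WordSize = erlang:system_info(wordsize),
            {total_heap_size, Total} = lists:keyfind(total_heap_size, 1, Info),
            %% total_heap_bytes is a derived, illustrative entry:
            %% the allocated heap converted from words to bytes.
            [{total_heap_bytes, Total * WordSize} | Info]
    end.
```

Calling heap_overview:overview(Pid) in the shell gives the allocated figures, but says nothing about which terms occupy the heap.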
>> 
>> Enough introduction, let's get to the actual meat: my idea was to create a new BIF,
>> namely erlang:inspect_heap(Pid), that allows us to take a look at any process' heap, fetch the
>> terms residing there and check their actual size. So, for instance:
>> 
>>> (ejabberd@REDACTED)12> S = erlang:inspect_heap(pid(0, 358, 0)).
>>> [{[[<<"5">>]|
>>>    284735200226724471091958640173737944785062822211005333957298336375301959844499896296764925551414319236776784],
>>>   20},
>>>  {{'$internal_queue_len',0},3},
>>>  {{random_seed,{8236,26623,17360}},7},
>>>  {{'$ancestors',[ejabberd_c2s_sup,ejabberd_sup,<0.40.0>]},9},
>>>  {{'$initial_call',{gen,init_it,6}},7},
>>>  {{state,{socket_state,tls,
>>>                        {tlssock,#Port<0.3936>,#Port<0.3938>},
>>>                        <0.357.0>},
>>>          ejabberd_socket,#Ref<0.0.0.10120>,false,<<"2855118401">>,
>>>          {sasl_state,"jabber",<<"pvp.net">>,[],
>>>                      #Fun<ejabberd_c2s.0.67315917>,#Fun<ejabberd_c2s.1.67315917>,
>>>                      #Fun<ejabberd_c2s.2.67315917>,cyrsasl_digest,
>>>                      {state,5,<<"3598825873">>,
>>>                             {<<"dupa">>,<<...>>},
>>>                             <<>>,#Fun<ejabberd_c2s.0.67315917>,...}},
>>>          true,
>>>          {jid,<<"dupa">>,<<"pvp.net">>,<<"hubbard">>,<<"dupa">>,
>>>               <<"pvp.net">>,<<"hubbard">>},
>>>          <<"Nicknamedupa">>,
>>>          {{1322,217197,749816},<0.358.0>},
>>>          {1,{<<"dupa">>,nil,nil}},
>>>          {1,{<<"dupa">>,nil,nil}},
>>>          {1,{<<"dupa">>,nil,nil}},
>>>          {xmlelement,<<"presence">>,[],
>>>                      [{xmlcdata,<<...>>},{xmlelement,...},{...}|...]},
>>>          {userlist,none,[],false}},
>>>   564},
>>>  {{limits,undefined},3},
>>>  {{[],[]},3}]
>> 
>> gives us pretty good knowledge about <0.358.0>:
>> • '$_' - OTP + gen_fsm2 process dictionary stuff, 3, 9, 7 words each
>> • random_seed - obvious
>> • {{limits,undefined},3} - internal limits for gen_fsm2 message queue, 3 words
>> • {[], []} - most probably leftovers after fetching user's presence lists
>> • {state, _} - gen_fsm2 state record - 564 words
>> • {[[<<"5">>]|, ...} - sequential tracing tokens? (I'm not very familiar with that; I would
>> say that's something from our rootset)
>> 
>> The implementation is rather simple: if the process we probe is not the calling one (i.e. we are not doing
>> erlang:inspect_heap(self())), the data is copied from the probed process' heap to the caller's heap (to avoid
>> having cross-process references in variables), then we compute the flat size of each term we copied. The rootset
>> is also included in the summary (i.e. process dict, seq tokens, etc.).
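
The copy-then-measure step can be approximated in plain Erlang, as a rough sketch only. This is not the code from the inspect_heap branch: the module heap_probe is hypothetical, it relies on erts_debug:flat_size/1 (a debugging helper, not a supported API), and it only sees the process dictionary and the message queue, not the stack, live variables or uncollected garbage that the proposed BIF would report.

```erlang
-module(heap_probe).
-export([summary/1]).

%% Userland approximation of the proposed output format [{Term, Words}].
%% Covers only the process dictionary and the queued messages of Pid.
summary(Pid) when is_pid(Pid) ->
    case erlang:process_info(Pid, [dictionary, messages]) of
        undefined ->
            undefined;
        [{dictionary, Dict}, {messages, Msgs}] ->
            %% process_info/2 has already copied these terms onto the
            %% caller's heap, so each flat size is measured on the copy.
            [{Term, erts_debug:flat_size(Term)} || Term <- Dict ++ Msgs]
    end.
```

Including the messages item also shows how the message queue mentioned earlier in the thread could appear in such a summary.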
>> 
>> Code is included in my inspect_heap OTP branch on:
>>  github: https://github.com/paulgray/otp/tree/inspect_heap
>> 
>> I am still a little bit hesitant about suspending the process we probe: can anyone tell
>> me if acquiring the main process lock would be enough to keep its heap untouched during
>> the call?
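
This does not answer the locking question, which concerns the C side of the BIF. As an Erlang-level illustration of the suspend-while-probing idea, here is a sketch using the existing erlang:suspend_process/1 and erlang:resume_process/1 BIFs; the module and function names are hypothetical.

```erlang
-module(probe_suspended).
-export([with_suspended/2]).

%% Suspends Pid, runs Fun(Pid), and always resumes Pid afterwards.
%% suspend_process/1 raises badarg for self() or a dead process,
%% so the guard rejects self() up front.
with_suspended(Pid, Fun) when is_pid(Pid), Pid =/= self(), is_function(Fun, 1) ->
    true = erlang:suspend_process(Pid),
    try
        Fun(Pid)
    after
        %% Runs even if Fun/1 throws, so the probed process is not
        %% left suspended by a failing probe.
        erlang:resume_process(Pid)
    end.
```

For example, probe_suspended:with_suspended(Pid, fun(P) -> erlang:process_info(P, dictionary) end) would read the dictionary while the process cannot run.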
>> 
>> Please do point out any bugs and tell me what you think about the idea.
>> 
>> Best regards,
>> Michal Ptaszek
>> _______________________________________________
>> erlang-questions mailing list
>> erlang-questions@REDACTED
>> http://erlang.org/mailman/listinfo/erlang-questions
>> 



