[erlang-questions] Process heap inspector

Mon Nov 28 16:23:26 CET 2011

On Mon, Nov 28, 2011 at 7:55 AM, Kostis Sagonas <kostis@REDACTED> wrote:
> On 11/28/2011 08:39 AM, Michal Ptaszek wrote:
>>
>> Hi everyone,
>>
>> This idea came to me while debugging a complex live system and trying
>> to figure out where all my memory had gone.
>>
>> When debugging a live system, investigating suspicious memory
>> consumption patterns, or simply trying to understand better what's
>> going on with our processes, it might be useful to take a peek at the
>> data a given process operates on.
>>
>> ...
>>
>> The implementation is rather simple: if the process we probe is not the
>> caller (i.e. we are not doing erlang:inspect_heap(self())), the data is
>> copied from the callee's heap to the caller's heap (to avoid having
>> cross-process references in variables), then we compute flat size of the
>> each term we moved. The rootset is also included in the summary
>> (process dict, seq tokens, etc.).
>>
>> Code is included in my inspect_heap OTP branch on:
>>  github: https://github.com/paulgray/otp/tree/inspect_heap
>>
>> I am still a little hesitant about suspending the process we probe: can
>> anyone tell me whether acquiring the main process lock would be enough
>> to keep its heap untouched during the call?
>>
>> Please do point out any bugs and tell me what you think of the idea.
>
> I can see that this may be handy to have in some situations, but provided I
> understand what is happening at the implementation level (disclaimer: I have
> not looked at the implementation), I think it's actually a pretty bad idea
> to include in a non debug-enabled runtime system.
>
> The reason is that this breaks all assumptions/invariants of the runtime
> system in that Erlang processes are independent and can be scheduled to
> execute concurrently on an SMP without being preempted by anything other
> than exhausting their reduction step count or being stuck on some receive.
> With this "built-in feature" processes need to be able to stop at more or
> less any random point and stay suspended for an indefinite amount of time
> based on code that _another_ process is executing.
>

I'm a bit confused: wouldn't this objection also apply to
erlang:suspend_process/2 [1]? I use it quite often in production on
long-lived processes that are chewing up resources. It's quite a handy
tool in certain cases.

[1] http://erlang.org/doc/man/erlang.html#suspend_process-2
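
For context, the suspend/inspect/resume pattern referred to above could look
roughly like the sketch below. The module and function names (suspend_probe,
snapshot/1) are hypothetical and not part of OTP or of the inspect_heap
branch; only erlang:suspend_process/1, erlang:process_info/2 and
erlang:resume_process/1 are existing BIFs. It only reads summary counters,
not the heap contents that the proposed erlang:inspect_heap/1 would copy.

```erlang
-module(suspend_probe).
-export([snapshot/1]).

%% Hypothetical helper: suspend Pid, read a few memory-related
%% items while it is stopped, then always resume it.
%% suspend_process/1 raises badarg if Pid is dead or is self().
snapshot(Pid) when is_pid(Pid), Pid =/= self() ->
    true = erlang:suspend_process(Pid),
    try
        %% Returns a list of {Item, Value} tuples, in the order requested.
        erlang:process_info(Pid, [memory, total_heap_size, message_queue_len])
    after
        %% Resume even if process_info/2 fails, so the target
        %% is never left suspended by this caller.
        erlang:resume_process(Pid)
    end.
```

Calling suspend_probe:snapshot(Pid) from a shell returns the property list
while the target is stopped. The target stays suspended only for the duration
of the call, which is the same kind of externally imposed pause the objection
above is about.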

> I am also concerned about how/whether sharing of subterms is preserved or
> not when doing the copying. (Based on the phrasing that "then we compute
> flat size of the each term we moved", I suspect the answer is no.)  Why is
> this useful?  You may end up with an arbitrarily bigger heap in the caller
> than the one that the callee currently has. Call me unimaginative but I do
> not really see why you would want that...
>
> Kostis