[erlang-questions] Process heap inspector

Tue Nov 29 19:56:11 CET 2011

On 11-29 08:28, Michal Ptaszek wrote:
> 
> >> Bit confused, but wouldn't this objection also apply to
> >> erlang:suspend_process/2 [1] as well? I use this quite often in
> >> production on long lived processes that are chewing up resources. Its
> >> quite the handy tool in certain cases.
> >> 
> >> [1] http://erlang.org/doc/man/erlang.html#suspend_process-2
> >> 
> > 
> > I think the problem with such a feature is that it breaks the
> > soft-realtimeness and preemptibility of all Erlang processes. By creating
> > and calling such a BIF you are essentially making it impossible to
> > schedule other processes if you have a single scheduler and a single CPU.
> > Most long-running BIFs run in separate async threads or are written
> > in such a way that one can stop them at any reasonable point
> > and continue later. This way a long-running BIF is broken
> > into some (maybe large) incremental steps, each one bringing
> > you closer to the result, but at each transition you can choose
> > to perform a step or go back to the scheduler (due to reduction exhaustion),
> > and be scheduled later to continue these steps...
> > 
> > This is, for example, the situation in the re module (regular expression)
> > BIFs, or even in simple ones like length/1.
> > 
> > So unless such a BIF is written in a preemptible way, it should not be
> > included in the non-debug build.
> 
> I'm sorry, but I disagree. In this case one process operates on the heap
> of another process - if we let caller to be preempted we have two ways to
> go:
> a) resume callee and risk GC/storing new terms on the heap

I was saying that it affects all OTHER processes. We should not resume
the target process, precisely to prevent anything being stored on its heap
or a GC being run on it. But we should be able to schedule OTHER processes,
which are completely independent of both caller and callee.

> b) leave callee suspended and risk caller to be terminated (I assume that
> we can be e.g. killed by exit(Pid, kill) BIF by any other process in the system?). 
> If so - callee will never be brought back to life again and remain suspended 
> for good. Plus, if we allow caller process to be preempted, time required for
> callee to be awaken grows again.

How about fixing the exit(Pid, kill) BIF? It needs to perform cleanup anyway
(like processing linked processes), and if we are locking, there is actually
no way to kill a process which is doing heap inspection. If we are copying in
an incremental way, it is simple: already copied data will be easily
cleaned up and deallocated by the existing memory cleanup procedure and GC.

We just should not allow the target process to be scheduled or to run while
any other process is performing heap inspection on it.

We should, however, allow all other processes to continue operating, as long
as they do not communicate with either of the two processes and are not
calling our BIF at the same time.

Only the message queue will then be affected, but we can solve this in a few
possible ways:

1) block senders when they try to queue a message into the receiving process
  - not really an option: it breaks asynchronous sending, and
    is essentially impossible in distributed Erlang
2) copy the message queue and ignore new messages queued after copying started
3) do not copy it at all (leave it to the process_info(Pid, messages) BIF)

In fact solution 3) (that is, process_info(Pid, messages)) should itself use
solution 2).

In fact we do not even need to fully lock the queue. If we have blocked
scheduling of the target process, then it cannot dequeue anything from its
message queue, and we should be happy just returning how the queue
looked at the moment we started copying (from the head to the current
end element). If someone queues a new message into the target's message
queue, it can be done without problems, as it will only be added at the end
of the queue, and that is secured by a separate lock AFAIK. We can just stop
copying the message queue when we hit the saved end element, or then retry
locking and copy the new elements one more time (but not again, as that
indeed could cause the copying to never end). This way it is all predictable,
doesn't affect other processes, and ends in finite time.
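
To illustrate the "stop at the saved end element" idea, here is a minimal
model in plain Erlang. It is not VM code: the queue is modelled as a list of
{Seq, Msg} pairs to which senders only ever append, and the module name
mq_snapshot and its functions are made up for this sketch.

```erlang
-module(mq_snapshot).
-export([snapshot/2, demo/0]).

%% Hypothetical model: Queue is a list of {Seq, Msg} ordered by Seq.
%% Copy messages from the head up to and including the saved end marker,
%% ignoring anything appended after the marker was taken.
snapshot(Queue, SavedEndSeq) ->
    Visible = lists:takewhile(fun({Seq, _Msg}) -> Seq =< SavedEndSeq end,
                              Queue),
    [Msg || {_Seq, Msg} <- Visible].

demo() ->
    Q0 = [{1, a}, {2, b}, {3, c}],
    SavedEnd = 3,                 % end element recorded when copying starts
    Q1 = Q0 ++ [{4, d}],          % a sender appends while we are copying
    snapshot(Q1, SavedEnd).
```

The copy is bounded by the marker, so a steady stream of senders cannot make
it run forever.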

> Also, if we take a look at the debug process_info(Pid, messages) - it also
> does not implement any kind of process interleaving, even if the message 
> queue is extremely large (and on non-SMP VMs can take a while). 

I think it can be improved. Because even in an SMP build, what will happen
if multiple processes try to perform process_info(Pid, messages) on
multiple other processes? Either it will all be scheduled on a single
predetermined processor (which lets other processes be scheduled
without problems on other CPUs, but may introduce a deadlock situation;
I'm not sure about that, or about how serialization is done in this
scenario), or each call will run on a separate CPU (which causes the same
problem if we do not have enough CPUs).

In fact, if process_info(Pid, messages) were improved as described
above, it could be used by heap_inspector, since the heap inspector
currently does not return the message queue anyway.

> 
> As it was pointed out debug tools are used for debugging, and thus should be 
> operated with knowledge on what are the consequences of the call. We have 
> init:stop/0 in the API, but no one complains that someone might apply it by 
> accident on the live system. 

Everybody knows what init:stop/0 does, and it is obvious what it will
do.

IMHO it is possible to write such heap inspector functionality in
a correct way, which will make it possible to also run it correctly on a
live system. Especially since not everybody will know that using
such a debugging function may affect the whole system. Also, an inexperienced
developer may be tempted to use it because he/she believes that the target
process has a small heap and that it will finish quickly. But there is no
easy way to know this for sure in advance (one will essentially need to
check various pieces of information like total_heap_size and
message_queue_len, returned by process_info(Pid, ItemSpec), etc.).
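
A minimal sketch of such a pre-check, using only erlang:process_info/2 with
the total_heap_size and message_queue_len items. The module name
inspect_preflight, the function looks_small/3, and both thresholds are
assumptions made up for this example; the answer is also only a snapshot and
can be stale by the time any inspection starts.

```erlang
-module(inspect_preflight).
-export([looks_small/3]).

%% Returns true or false, or undefined if the (local) process is not alive.
%% MaxHeapWords and MaxQueueLen are caller-chosen, hypothetical limits.
looks_small(Pid, MaxHeapWords, MaxQueueLen) ->
    case erlang:process_info(Pid, [total_heap_size, message_queue_len]) of
        undefined ->
            undefined;
        Info ->
            {total_heap_size, Heap} =
                lists:keyfind(total_heap_size, 1, Info),
            {message_queue_len, QLen} =
                lists:keyfind(message_queue_len, 1, Info),
            Heap =< MaxHeapWords andalso QLen =< MaxQueueLen
    end.
```

For example, inspect_preflight:looks_small(Pid, 100000, 1000) asks whether
the process holds at most 100000 words of heap and at most 1000 queued
messages.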

In fact the documentation of erlang:process_info/2 doesn't mention at all
that it breaks Erlang's scheduling promises. There is a note for
erlang:process_info/1 that it should only be used for debugging, but
there is nothing about erlang:process_info/2, even when it is called to
return messages.

"""
  Warning:
            This  BIF  is intended for debugging only, use process_info/2
              for all other purposes.
"""

So this needs fixing, as it clearly gives me the impression that calling
process_info(Pid, messages | links | dictionary) is safe in some sense
(I only mention these three ItemSpec items because the rest are probably
computable in constant time, so they will have negligible impact even if
they use full process locking, or even full VM locking).

The existence of badly implemented BIFs is not a justification for creating
more badly implemented BIFs. I'm not against helpful BIFs (well,
actually in some sense I am, because they add to the VM complexity
substantially), especially outside of OTP, but if one wants better
integration and official support, IMHO they should be implemented
properly.

And it is not hard: you just need an explicit stack which will be
used to know where in the process of copying we are. For copying a single
term there is already functionality in the VM (used when copying a message
to another process's message queue), and it is probably already written in
a correct way (modulo the next paragraph, which is in practice of small
importance).
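
The explicit-stack idea can be sketched in plain Erlang. This is only a model
of the control flow, not how the VM copies terms: it counts nodes instead of
copying them, it walks only tuples and lists (everything else is a leaf), and
the module name incr_walk and the step budget are invented for the example.

```erlang
-module(incr_walk).
-export([start/1, step/2, run/2]).

%% State is {Stack, Count}: terms still to visit and nodes visited so far.
start(Term) -> {[Term], 0}.

%% Visit at most Budget nodes, then yield with the saved state.
step({[], Count}, _Budget) ->
    {done, Count};
step(State, 0) ->
    {more, State};
step({[T | Rest], Count}, Budget) when is_tuple(T) ->
    step({tuple_to_list(T) ++ Rest, Count + 1}, Budget - 1);
step({[[H | T] | Rest], Count}, Budget) ->
    step({[H, T | Rest], Count + 1}, Budget - 1);
step({[_Leaf | Rest], Count}, Budget) ->
    step({Rest, Count + 1}, Budget - 1).

%% Drive the walk to completion; returns {NodeCount, NumberOfYields}.
run(Term, Budget) when is_integer(Budget), Budget > 0 ->
    run(start(Term), Budget, 0).

run(State, Budget, Yields) ->
    case step(State, Budget) of
        {done, Count}  -> {Count, Yields};
        {more, State1} -> run(State1, Budget, Yields + 1)
    end.
```

Each {more, State} return is a point where a real BIF could go back to the
scheduler and be resumed later with the same stack. Note that this walk
visits a shared subterm once per reference.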

Another interesting problem is what to do with terms constructed by a
process like this:

  A = {"just small list"},
  B = {A,A,A,A,A,A,A,A,A,A,A,A,A,A,A,A},
  C = {B,B,B,B,B,B,B,B,B,B,B,B,B,B,B,B},
  D = {C,C,C,C,C,C,C,C,C,C,C,C,C,C,C,C},
  E = {D,D,D,D,D,D,D,D,D,D,D,D,D,D,D,D},
  F = {E,E,E,E,E,E,E,E,E,E,E,E,E,E,E,E},
  G = {F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F},
  337781283 = size(term_to_binary(G)).  % this actually crash-dumps my beam with an eheap_alloc failure.

It can be a disaster to try to copy them to another process
in the normal way (using the existing term-copy functionality).
(It is possible to copy them safely, maybe even quite fast
when doing this in C, because in fact it is possible even in pure
Erlang: http://erlang.org/pipermail/erlang-questions/2009-September/046452.html )
And even checking process_info(Pid, heap_size) will not
help the developer know whether it is safe to perform heap inspection.
Such terms don't occur normally, and developers will know
when to expect them. Another possibility is to have an additional
option for the heap inspector which will allow seeing only part of the heap
(you will not inspect 300MB of heap manually anyway, so it is
in most cases pointless to copy it all).
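
The gap between the shared and the flattened size of such a term can be
observed with erts_debug:size/1 and erts_debug:flat_size/1, which are
undocumented debugging helpers. The sketch below is only an illustration:
the module name sharing_demo is made up, the leaf list is built at run time
rather than taken from the example above, and the nesting depth is kept
small so that it runs quickly.

```erlang
-module(sharing_demo).
-export([nested/1, sizes/1]).

%% Build the nesting at run time so that every level holds 16 references
%% to the same inner term.
nested(0) ->
    {lists:seq(1, 15)};           % a small list, built on the process heap
nested(N) when is_integer(N), N > 0 ->
    Inner = nested(N - 1),
    list_to_tuple(lists:duplicate(16, Inner)).

%% Returns {WordsWithSharing, WordsIfFlattened}.
sizes(Term) ->
    {erts_debug:size(Term), erts_debug:flat_size(Term)}.
```

Calling sharing_demo:sizes(sharing_demo:nested(3)) in a single process
should report a shared size far below the flat size, and the flat size grows
by roughly a factor of 16 with each extra level.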

Maybe there is something I missed which makes it impossible to write a
heap inspector properly at all? If so, then well, I will
accept reality. I just do not want to make the Erlang situation
with BIFs worse than it currently is, if possible.

Regards,
Witek

> 
> On Nov 28, 2011, at 2:55 PM, Kostis Sagonas wrote:
> > I am also concerned about how/whether sharing of subterms is preserved or not when doing the copying. (Based on the phrasing that "then we compute flat size of the each term we moved", I suspect the answer is no.)  Why is this useful?  You may end up with an arbitrarily bigger heap in the caller than the one that the callee currently has. Call me unimaginative but I do not really see why you would want that...
> 
> Right, that's yet another thing I must work on: thank you for the hint!
> 
> Kind regards,
> Michal Ptaszek

-- 
Witold Baryluk
JID: witold.baryluk // jabster.pl