[erlang-questions] Suspending Erlang Processes

Rickard Green rickard@REDACTED
Tue Oct 1 22:06:39 CEST 2019


On Mon, Sep 30, 2019 at 1:57 PM Duncan Paul Attard <
duncan.attard.01@REDACTED> wrote:
>
> I am tracing an Erlang process, say, `P` by invoking the BIF
`erlang:trace(Pid_P, true, [set_on_spawn, procs, send, 'receive'])` from
some process. As per the Erlang docs, the latter process becomes the tracer
for `P`, which I shall call `Trc_P`.
>
> Suppose now, that process `P` spawns a new process `Q`. Since the flag
`set_on_spawn` was specified in the call to `erlang:trace/3` above, `Q`
will automatically be traced by `Trc_P` as well.
>
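
For readers following along, the starting configuration described above can be
reproduced with a small sketch. The module and function names are made up for
illustration; the calling process plays the role of `Trc_P`, and
`erlang:trace_info/2` is used only to show which tracer each process ends up
with.

```erlang
-module(trace_inherit_demo).
-export([run/0]).

%% Hypothetical demo: returns {TracerOfP, TracerOfQ} as reported by
%% erlang:trace_info/2. The caller acts as Trc_P.
run() ->
    Self = self(),
    %% P waits for 'go', spawns a child (Q in the text), reports the
    %% child's pid, then idles until told to stop.
    P = spawn(fun() ->
                  receive go -> ok end,
                  Child = spawn(fun() -> receive stop -> ok end end),
                  Self ! {spawned, Child},
                  receive stop -> ok end
              end),
    %% The calling process becomes the tracer of P.
    1 = erlang:trace(P, true, [set_on_spawn, procs, send, 'receive']),
    P ! go,
    %% Selective receive: trace messages also land in this mailbox and
    %% are simply left there by this sketch.
    Q = receive {spawned, Pid} -> Pid after 5000 -> exit(timeout) end,
    Result = {erlang:trace_info(P, tracer), erlang:trace_info(Q, tracer)},
    P ! stop,
    Q ! stop,
    Result.
```

Because of `set_on_spawn`, both elements of the result should name the calling
process as tracer, which is the situation the rest of the question starts from.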
> ---
>
> I want to spawn a **new** tracer, `Trc_Q`, and transfer the ownership of
tracing `Q` to it, so that the resulting configuration will be that of
process `P` being traced by tracer `Trc_P`, `Q` by `Trc_Q`.
>

Unfortunately I do not have any ideas on how to accomplish this.

> However, Erlang permits **at most** one tracer per process, so I cannot
achieve said configuration by invoking `erlang:trace(Pid_Q, true, ..)` from
`Trc_Q`. The only way possible is to do it in two steps:
>
> 1. Tracer `Trc_Q` calls `erlang:trace(Pid_Q, false, ..)` to stop `Trc_P`
from tracing `Q`;
> 2. `Trc_Q` calls `erlang:trace(Pid_Q, true, ..)` again to start tracing
`Q`.
>
> In the time span between steps **1.** and **2.** above, it might be
possible that trace events by process `Q` are **lost** because at that
moment, there is no tracer attached. One way of mitigating this is to
perform the following:
>
> 1. Suspend process `Q` by calling `erlang:suspend_process(Pid_Q)` from
`Trc_Q` (note that as per Erlang docs, `Trc_Q` remains blocked until `Q` is
eventually suspended by the VM);
> 2. `Trc_Q` calls `erlang:trace(Pid_Q, false, ..)` to stop `Trc_P` from
tracing `Q`;
> 3. `Trc_Q` calls `erlang:trace(Pid_Q, true, ..)` again to start tracing
`Q`;
> 4. Finally, `Trc_Q` calls `erlang:resume_process(Pid_Q)` so that `Q` can
continue executing.
>
> From what I was able to find out, while `Q` is suspended, messages sent
to it are queued, and when resumed, `Trc_Q` receives the `{trace, Pid_Q,
receive, Msg}` trace events accordingly without any loss.
>

This is not a feature; it is a bug (introduced in erts 10.0, OTP 21.0) that
will be fixed. The trace message should have been delivered even though the
receiver was suspended.

You cannot even rely on this behavior while this bug is present. If you (or
any process in the system) send the suspended process a non-message signal
(monitor, demonitor, link, unlink, exit, process_info, ...), the bug will
be bypassed and the trace message will be delivered.
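
To separate the two effects discussed here, the following minimal sketch shows
only the ordinary suspend/resume behaviour, with no tracing involved: a
suspended process does not run, so a message sent to it gets no reply until
the process is resumed. The module name, the ping/pong protocol and the
timeouts are assumptions made for illustration. The sketch says nothing about
when trace messages are delivered, which is the part that should not be relied
upon.

```erlang
-module(suspend_demo).
-export([run/0]).

%% Hypothetical demo: returns {WhileSuspended, AfterResume}.
run() ->
    Self = self(),
    %% Q answers every ping with a pong.
    Q = spawn(fun Loop() ->
                  receive
                      {ping, From} -> From ! {pong, self()}, Loop()
                  end
              end),
    %% Blocks until Q is actually suspended.
    true = erlang:suspend_process(Q),
    Q ! {ping, Self},
    %% Q cannot execute its receive while suspended, so no pong arrives.
    WhileSuspended = receive {pong, Q} -> replied after 200 -> no_reply end,
    true = erlang:resume_process(Q),
    %% Once resumed, Q handles the queued ping.
    AfterResume = receive {pong, Q} -> replied after 5000 -> no_reply end,
    exit(Q, kill),
    {WhileSuspended, AfterResume}.
```

Calling `suspend_demo:run()` is expected to give `{no_reply, replied}`: the
ping waits in the queue of `Q` while it is suspended and is handled after the
resume.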

> However, I am hesitant to use suspend/resume, since the Erlang docs
explicitly say that these are to be used for *debugging purposes only*.

Mission accomplished! :-)

> Any idea as to why this is the case?
>

The language was designed with other communication primitives intended for
use. Suspend/Resume was explicitly introduced for debugging purposes only,
and not for use by ordinary Erlang programs. They will most likely not
disappear, but debug functionality in general is not treated as carefully
by us at OTP as ordinary functionality with regard to compatibility, etc.
For example, we removed the automatic deadlock prevention in
suspend_process() that existed prior to erts 10.0 for performance reasons.

Regards,
Rickard
--
Rickard Green, Erlang/OTP, Ericsson AB