[erlang-questions] Erlang's process not killed by exit signal reasoned by "kill"

Wed Oct 28 22:32:03 CET 2009

On Sat, Oct 24, 2009 at 9:10 PM, Robert Virding <rvirding@REDACTED> wrote:
> I just want to clear up one little point: there is in fact only one type of
> process, all processes behave in the same way with respect to dying and exit
> signals. So what Joe calls "system processes" are just processes started by
> the system, nothing more.

In this context "system process" means "a process that has evaluated
process_flag(trap_exit, true)"

>
> That exit(kill) and exit(Self, kill) behave differently is *exactly* as it
> should be, they are *defined* to behave differently. exit/2 will *always*
> send an exit signal to the process, even if that process is the caller itself! Its behaviour
> is consistent. When sending to itself the process should behave as if the
> exit signal came from "outside" itself. This means that if you are
> trapping exits you should also trap the exits from exit/2 to yourself.
>

Yes - in the book and elsewhere we refer to "signals" and "messages" - but they
are not the same thing. Signals get converted to messages and put into the
inbox of a process if you have said process_flag(trap_exit, true). But the signal
generated by exit(Pid, kill) *never* gets converted to a message.

When c dies by evaluating exit(X) it broadcasts a signal to its link set.
If b is linked to c and is trapping exits, it will convert this to the message
{'EXIT',Pid,X}; this is true for all X. If c says exit(B, kill), where B is the
pid of b, a *different kind* of unstoppable signal is sent and b must die.
Signals are internal things and can never be printed; we can only print the
messages that result from signals being converted to messages.
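
A minimal sketch of the first half of that paragraph, for anyone who wants to
try it in the shell. The module name trap_demo and the function run/0 are made
up for illustration; only process_flag/2, spawn_link/1 and exit/1 are real BIFs.
The caller traps exits, links to a child that evaluates exit(kill), and reports
what turns up in its inbox.

```erlang
-module(trap_demo).
-export([run/0]).

%% Hypothetical helper: returns what a trapping process receives when a
%% linked child terminates by calling exit(kill).
run() ->
    %% Become a "system process" for the duration of the call.
    Old = process_flag(trap_exit, true),
    Child = spawn_link(fun() -> exit(kill) end),
    Result = receive
                 {'EXIT', Child, Reason} -> {trapped, Reason}
             after 1000 ->
                 timeout
             end,
    %% Restore the caller's previous trap_exit setting.
    process_flag(trap_exit, Old),
    Result.
```

The signal produced by exit(kill) is an ordinary one that happens to carry the
reason kill, so the caller should survive and see it as a message.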

The original reason for this was to be able to kill rogue processes. If A wants
to kill B it evaluates exit(B, kill) and we are guaranteed that B will die. If A
just says exit(kill) it says "I'm dying with reason kill".
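
To see the guarantee from A's side, here is a small sketch, not taken from the
book. The module kill_demo and the function run/1 are hypothetical names. B
traps exits and reports anything it manages to trap; A sends it exit(B, Reason)
and uses a monitor to find out whether B died.

```erlang
-module(kill_demo).
-export([run/1]).

%% Hypothetical helper: send exit(B, Reason) to a process B that traps
%% exits, and report whether B trapped the signal or died.
run(Reason) ->
    A = self(),
    B = spawn(fun() ->
                  process_flag(trap_exit, true),
                  A ! {self(), ready},
                  receive
                      {'EXIT', _From, Why} -> A ! {self(), trapped, Why}
                  end,
                  %% Stay alive after trapping so the caller can tell
                  %% "trapped" apart from "died".
                  receive never_sent -> ok end
              end),
    %% Wait until B is definitely trapping exits.
    receive {B, ready} -> ok end,
    Ref = erlang:monitor(process, B),
    exit(B, Reason),
    Result = receive
                 {B, trapped, Trapped} -> {trapped, Trapped};
                 {'DOWN', Ref, process, B, Died} -> {died, Died}
             after 1000 ->
                 timeout
             end,
    %% Clean up: drop the monitor and make sure B is gone.
    erlang:demonitor(Ref, [flush]),
    exit(B, kill),
    Result.
```

With an ordinary reason such as foo, B traps the signal and carries on. With
kill, B has no say in the matter and the monitor reports the reason as killed.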

The distinction between messages and signals is subtle.

If you read section 9.4 subsection "system processes" of my book you'll see this
behavior is described more or less as I have described it above.

Now if you didn't like the above description we can try a different approach

 ---- a different way of describing the same thing

The VM behaves as if there were 3 types of inter-process messages:

      (1, Data)  tuples represent regular messages
      (2, Data)  tuples represent signals
      3          is the kill signal

    P ! X sends a type 1 message  (1,X)
    exit(X) sends  a type 2 message (2,X)
    exit(P, kill) sends a type 3 message
    exit(P, X) sends a type 2 message (2,X)

on receiving a type 2 message (2,X) a normal process dies if X != normal
on receiving a type 2 message (2,X) a system process converts it to a (1,X) message
on receiving a type 3 message the receiving process dies unconditionally
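
The rules above can be written down as two small pure functions. This is only
a sketch of the made-up machine, not of the real VM. The module name
signal_model and the atoms bang, exit1, exit2, ordinary, system, enqueue,
ignore and die are all labels invented here. The "ignore" clause is an
assumption about the one case the rules leave unstated: a normal process
receiving (2, normal).

```erlang
-module(signal_model).
-export([send/1, deliver/2]).

%% send/1: which type of message an operation produces in the model.
send({bang, X})     -> {1, X};   % P ! X
send({exit1, X})    -> {2, X};   % exit(X), sent to the link set
send({exit2, kill}) -> 3;        % exit(P, kill)
send({exit2, X})    -> {2, X}.   % exit(P, X)

%% deliver/2: what the receiver does with a message. The second argument
%% is 'ordinary' for a normal process and 'system' for one that has
%% evaluated process_flag(trap_exit, true).
deliver({1, X}, _Kind)         -> {enqueue, X};
deliver({2, normal}, ordinary) -> ignore;        % assumption, see above
deliver({2, _X}, ordinary)     -> die;
deliver({2, X}, system)        -> {enqueue, X};  % converted to (1, X)
deliver(3, _Kind)              -> die.
```

In this model exit(kill) produces (2, kill), which a system process simply
puts in its inbox, while exit(P, kill) produces the type 3 message and the
receiver dies whatever kind of process it is.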

--- end

This is reasonably clear - the *problem* is that to describe the behavior I've
had to suddenly invent a new abstract machine and you have to guess the
semantics of this machine. That's why we try very hard to use terms like signals
and messages to describe the behavior in an abstract way and not in terms of a
VM - because the description of the VM itself would be pretty large and for most
purposes is not necessary. ((and anyway if I describe the VM in a language L1,
what do I describe L1 in? L2? and what do I describe L2 in .....)) To make life
simple we try to use meta-circular descriptions and describe Erlang behavior in
terms of Erlang. Mostly this works - but for the case of signals and messages it
is not easy.

> OTOH an exit/1 works "internally". It should behave in the same way as
> erlang:error/1. In fact originally they were the same and there was only
> exit/1. Internal errors also looked the same as exit/1. Things started to
> drift apart when internal errors got stack traces. Now stack traces are a
> Good Thing but it meant that the behaviour of errors and exit/1 diverged.
> Unfortunately, instead of fixing exit/1, which would have been the most
> sensible thing to do, the split was made permanent by adding error/1.
>
> So exit/1 and error/1 should behave in the same way, which is different to
> exit/2.
>
> 'kill' is a little special as we felt we needed something which wasn't
> trappable but at the same time we realised that if it spread in the same way
> as other exit signals then it would be uncontrollable. This is why it was
> decided that when a process received an exit 'kill' signal it should always
> unconditionally die but only resignal 'killed' to its linked processes.
> Whether doing exit(kill) should send a real exit 'kill' signal or only a
> 'killed' signal is an interesting question which I can't remember now what
> we decided. If it sends an exit 'kill' signal then it should behave as a
> real 'kill' signal and be non-trappable.
>
> This is definitely something for the Erlang Rationale!

Yup

/Joe

>
> Robert
>
> 2009/10/24 Yan Yan <yan.beijing.china@REDACTED>
>
>> Thanks, Richard
>>
>> In fact, I had a very similar thought to yours as I found b killed when
>> b is not a system process but b not killed when b is a system process.
>>
>> However, even if exit(kill) acts the same as exit(self(), kill) does, there
>> seems to be problems left. For example, if, in this setup, c does want ALL
>> linked processes to be killed when c itself exits abnormally, we will want a
>> simple code in c() just like exit(kill). In this way, we want all processes
>> linked to c will be killed because of c's exit, even if there were any
>> system processes among c's linked processes. Currently, if we write
>> exit(kill) or exit(self(), kill) in c(), only c and c's linked non-system
>> processes will be killed, but all linked system processes will survive,
>> which is not correct.
>>
>> So far, there seems to be no other ways to let it happen correctly except:
>>
>> Code:
>>
>> c(......) ->
>> blabla...
>>
>> %% P1, P2, ..., Pn are linked SYSTEM processes
>> exit(P1, kill),
>> exit(P2, kill),
>> ...
>> exit(Pn, kill),
>>
>> %%Here c is still alive, and we want c and its linked non-system processes
>> also killed, so...
>> exit(kill). %% or exit(self(), kill), or exit(anything)..
>>
>> End code.
>>
>> However, it still demands that programmers know ahead of time how many and
>> which system processes are linked to c, which is not a good design pattern.
>>
>> Sincerely,
>>
>> Yan Yan
>> 2009-10-24
>>
>>
>>
>> From: Richard Carlsson
>> Time:  2009-10-24 21:05:05
>> To: Yan Yan
>> Fw: erlang-questions; erlang-bugs; Björn_Gustavsson
>> Subject: Re: [erlang-questions] Erlang's process not killed by exit
>> signal reasoned by "kill"
>>
>> Yan Yan wrote:
>> > (2)On Page 169,
>> (Page 161 in my copy of the book.)
>> > Quote:
>> >
>> > 8> edemo1:start(true, {die,kill}).
>> > Process b received {'EXIT',<0.73.0>,kill}
>> > process b (<0.72.0>) is alive
>> > process c (<0.73.0>) is dead
>> > ok
>> >
>> > End quote.
>> >
>> > Here a and b are both system processes, while c is not. When c exits
>> > with the reason "kill" (not "killed"), it sends exit signal to b with
>> > the reason "kill". Therefore b should be killed and dead, but b is
>> > still alive here!
>> >
>> > (I had thought it was only a small typo in the book. But then I
>> > tested by myself and got the same result: b received the exit signal
>> > with the reason "kill" and b was not killed but alive.)
>> >
>> > Why is the system process b not killed by the exit signal with the
>> > reason "kill"?
>> Interesting. There seems to be a difference in behaviour (probably
>> not intentional) between `exit(kill)' and `exit(self(),kill)'.
>> A summary of the setup: we have three processes, linked in a chain.
>> 'a' is always trapping, 'b' may or may not be trapping, and 'c' is
>> the one who dies by calling exit(Reason).
>>  1. When 'b' is not trapping, and 'c' does exit(kill), we see a
>>     report from 'a' that 'b' died with reason 'killed'.
>>  2. When 'b' is trapping, and 'c' does exit(kill), 'b' survives
>>     and reports that it sees the exit reason 'kill' from 'c'
>>     (not 'killed').
>> The question is: in case 2, why didn't 'b' die, when it apparently
>> got a 'kill' message. This should be untrappable, which is why it is
>> changed to 'killed' when it is propagated. But it was 'c' who died,
>> so why wasn't the atom changed to 'killed'?
>> If we change `exit(kill)' to `exit(self(),kill)' in 'c', we get the
>> effect we expected: 'b' survives and reports that it sees 'killed'
>> as the exit reason from 'c'.
>> Then, a new question is why 'a' saw 'killed' in case 1 when 'b' is
>> non-trapping. If 'c' dies in the same way in both, then doesn't
>> 'b' get the same signal from 'c'? Apparently, it does, but since
>> 'b' is not trapping, it doesn't matter what the atom is as long
>> as it is something else than 'normal'. So 'b' dies due to an
>> incoming 'kill', and this is then propagated as 'killed'.
>> It seems that when a process does exit(kill) on itself, it causes
>> a different outgoing signal than if it does exit(self(),kill).
>> The former is not untrappable even though it has the reason 'kill',
>> but if that in its turn causes another process to die, the atom
>> will _then_ be rewritten to 'killed'.
>> To me, it seems that this should be fixed so that exit(kill), even
>> if it's an unusual case, should be propagated as 'killed'.
>>     /Richard
>>
>