application:stop hangs

Ulf Wiger
Fri May 25 12:24:07 CEST 2001


On Thu, 24 May 2001, Chandrashekhar Mullaparthi wrote:

>After a bit more digging - there was no link between my worker process and
>the supervisor!! I was using proc_lib:spawn. I used proc_lib:spawn_link and
>everything works fine.
>
>I had assumed that the supervisor will link to all its children - it seems
>to be the other way round. Why does the supervisor not link explicitly - it
>would be quite handy!

We've had discussions about this for quite some time. Having the
supervisor link to the child automatically is not backward
compatible, nor does it completely solve the problem, since the
child can still remove the link. I *have* seen at least one case
where the child has explicitly called unlink(Parent) (!). I was
told that the reason had to do with some special test method
from the Erlang shell(*), and that they had forgotten to take it
out...


(*) Actually, if you start a gen_server using start_link from the
shell, you'd better explicitly unlink it -- if you mistype a
command and the shell process crashes (which happens pretty
often), your server will die as well.
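
A minimal sketch of the footnote's advice, for illustration only. The
module name shell_unlink_demo and the function start_unlinked/0 are
hypothetical and not part of the original post; the callbacks do
nothing useful. The server is started with start_link and then
unlinked from the calling process, so a crash of the caller (for
example the shell process) does not take the server down with it.

```erlang
-module(shell_unlink_demo).
-behaviour(gen_server).

-export([start_unlinked/0]).
-export([init/1, handle_call/3, handle_cast/2]).

%% Start the server linked to the caller, then drop the link so that
%% a crash of the caller does not propagate to the server.
start_unlinked() ->
    {ok, Pid} = gen_server:start_link(?MODULE, [], []),
    true = unlink(Pid),
    {ok, Pid}.

%% Placeholder callbacks (hypothetical, kept as small as possible).
init([]) ->
    {ok, no_state}.

handle_call(_Request, _From, State) ->
    {reply, ok, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.
```

From the shell this would be called as
{ok, Pid} = shell_unlink_demo:start_unlinked(). Starting the server
with gen_server:start/3 instead avoids creating the link in the first
place.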

The solution should be to have the supervisor rely on monitors
instead of links for the shutdown protocol.

I've patched supervisor.erl with the following code.
It seems to work for me.

/Uffe


%% This assumes the supervisor already holds a monitor on Pid, e.g. from
%% erlang:monitor(process, Pid) when the child was started (that part of
%% the patch is not shown here). A 'DOWN' message then arrives whether
%% or not the child is linked to the supervisor.
shutdown(Pid, brutal_kill) ->
    unlink(Pid),
    case already_exited(Pid, killed) of
        false ->
            exit(Pid, kill),
            receive
                {'DOWN', _MRef, process, Pid, killed} ->
                    ok;
                {'DOWN', _MRef, process, Pid, OtherReason} ->
                    {error, OtherReason}
            end;
        Other ->        % ok | {error, ErrorReason}
            Other
    end;
shutdown(Pid, Time) ->
    unlink(Pid),
    case already_exited(Pid, shutdown) of
        false ->
            exit(Pid, shutdown),
            receive
                {'DOWN', _MRef, process, Pid, shutdown} ->
                    ok;
                {'DOWN', _MRef, process, Pid, OtherReason} ->
                    {error, OtherReason}
            after Time ->
                    exit(Pid, kill),  %% Force termination.
                    receive
                        {'DOWN', _MRef, process, Pid, OtherReason} ->
                            {error, OtherReason}
                    end
            end;
        Other ->        % ok | {error, ErrorReason}
            Other
    end.


%% Non-blocking check for a 'DOWN' message that is already in the
%% mailbox, i.e. the child exited before we asked it to.
already_exited(Pid, ExpectedReason) ->
    receive
        {'DOWN', _MRef, process, Pid, ExpectedReason} ->
            ok;
        {'DOWN', _MRef, process, Pid, OtherReason} ->
            {error, OtherReason}
    after 0 ->
            false
    end.
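
To show why monitors help with the original problem, here is a small
standalone sketch, not taken from supervisor.erl. The module
monitor_shutdown_demo and its functions run/0, stop_child/2 and
worker/0 are made up for this example. The worker is started with
plain spawn/1, so there is no link at all, yet the caller still learns
when it has terminated because it sets up its own monitor first.

```erlang
-module(monitor_shutdown_demo).
-export([run/0, stop_child/2]).

%% Start a worker without a link, then shut it down via a monitor.
run() ->
    Pid = spawn(fun worker/0),
    stop_child(Pid, 1000).

%% Ask Pid to exit with reason shutdown. If no 'DOWN' message arrives
%% within Timeout milliseconds, kill it and wait for the 'DOWN'.
stop_child(Pid, Timeout) ->
    MRef = erlang:monitor(process, Pid),
    exit(Pid, shutdown),
    receive
        {'DOWN', MRef, process, Pid, shutdown} ->
            ok;
        {'DOWN', MRef, process, Pid, Reason} ->
            {error, Reason}
    after Timeout ->
        exit(Pid, kill),  % force termination
        receive
            {'DOWN', MRef, process, Pid, Reason} ->
                {error, Reason}
        end
    end.

%% Hypothetical worker: does not trap exits and just waits for a message.
worker() ->
    receive
        stop -> ok
    end.
```

If the process is already dead when the monitor is created, the
'DOWN' message is delivered immediately with reason noproc, so
stop_child/2 returns {error, noproc} rather than hanging.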




-- 
Ulf Wiger                                    tfn: +46  8 719 81 95
Senior System Architect                      mob: +46 70 519 81 95
Strategic Product & System Management    ATM Multiservice Networks
Data Backbone & Optical Services Division      Ericsson Telecom AB



