[erlang-questions] Process surviving disconnect
Vincenzo Maggio
maggio.vincenzo@REDACTED
Wed Aug 17 21:48:30 CEST 2011
Hello,
I think something is wrong here.
> Bob died of noconnection.
This is printed by
> {'EXIT', Bob, Reason} ->
> io:format ("Bob died of ~p.~n", [Reason] ),
So Alice is in fact receiving Bob's last death cry :D, and process_flag (trap_exit, true) turns the exit signal into a message instead of letting it kill Alice. I think this is fine from Alice's point of view, so the real problem is that Bob is dying (I know it sounds obvious, but I have learned not to make assumptions).
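To make the trap_exit point concrete, here is a minimal local sketch (no distribution involved). The module name trap_exit_demo and the exit reason boom are made up for illustration and are not part of the original code:

```erlang
-module(trap_exit_demo).
-export([run/0]).

%% Spawn a linked child that exits with reason `boom` and report what the
%% trapping parent finds in its mailbox.
run() ->
    %% Remember the previous flag so the caller's setting can be restored.
    Old = process_flag(trap_exit, true),
    Child = spawn_link(fun() -> exit(boom) end),
    Result =
        receive
            %% With trap_exit set, the exit signal arrives as a plain message.
            {'EXIT', Child, Reason} -> {trapped, Reason}
        after 1000 ->
            timeout
        end,
    process_flag(trap_exit, Old),
    Result.
```

Calling trap_exit_demo:run() returns {trapped, boom}: the parent survives and sees the child's exit reason as ordinary data, which is what Alice does with noconnection.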
Hmm, I don't know whether losing the connection between the two processes makes the Erlang VM assume some kind of virtual master node.
If you want my opinion, I think you should report this on the Erlang bugs mailing list if no one comes up with a proper explanation.
Even if what we're thinking is wrong and this is not a bug, it is worth asking: in the past I had a problem with node lookup and they resolved it.
These are my two cents, but if you can, please let me know if there are further updates, because it's a really interesting problem.
BTW, before filing a bug, could you please replace spawn_link with spawn_monitor and remove the process_flag lines? It would be interesting to see whether Bob dies on his own or is killed because he can no longer communicate with Alice.
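A rough sketch of what I mean, under some assumptions: as far as I know spawn_monitor only accepts a node argument in newer OTP releases, so this uses spawn/4 followed by erlang:monitor/2 instead. The module name monitor_sketch and the function names are made up, and both roles live in one module, which would have to be loaded on both nodes:

```erlang
-module(monitor_sketch).
-export([start/1, alice_init/1, bob_loop/1]).

%% Hypothetical stand-in for alice:start/0. Node is where Bob should run,
%% e.g. Bob's node name, or node() for a purely local try-out.
start(Node) ->
    spawn(?MODULE, alice_init, [Node]).

alice_init(Node) ->
    %% No process_flag (trap_exit, true) and no link: Bob is only monitored.
    Bob = spawn(Node, ?MODULE, bob_loop, [self()]),
    Ref = erlang:monitor(process, Bob),
    alice_loop(Bob, Ref).

alice_loop(Bob, Ref) ->
    receive
        stop -> ok;
        {'DOWN', Ref, process, Bob, Reason} ->
            %% Reason is noconnection when the connection to Bob's node is lost.
            %% Alice stops here, since there is nothing left to watch.
            io:format("Bob is down: ~p.~n", [Reason]);
        Msg ->
            io:format("Alice received ~p.~n", [Msg]),
            alice_loop(Bob, Ref)
    end.

%% Bob neither links to Alice nor traps exits in this variant.
bob_loop(Alice) ->
    receive
        stop -> ok;
        Msg ->
            io:format("Bob received ~p.~n", [Msg]),
            bob_loop(Alice)
    end.
```

With no link between the two, a lost connection can no longer deliver an exit signal to Bob. If he still disappears after the cable is pulled, something other than the link is killing him.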
Vincenzo
> Date: Wed, 17 Aug 2011 10:57:35 -0600
> From: erlang@REDACTED
> To: maggio.vincenzo@REDACTED
> CC: erlang-questions@REDACTED
> Subject: RE: [erlang-questions] Process surviving disconnect
>
> Thank you very much Vincenzo. You affirmed my assertion that Bob should
> survive the disconnect.
> Nevertheless he dies.
> I will point out exactly what I do and maybe someone can spot the error
> in my code, my setup or my thinking and tell me what I am doing wrong.
>
> I start a node on gca.local:
> unroot@REDACTED:~$ erl -name 'bob@REDACTED' -setcookie 123
>
> I start a node on usa.local:
> unroot@REDACTED:~$ erl -name 'alice@REDACTED' -setcookie 123
>
> I start sasl on bob@REDACTED:
> (bob@REDACTED)1> application:start (sasl).
>
> I run alice:start/0 on alice@REDACTED:
> (alice@REDACTED)1> alice:start ().
> true
>
> I look for bob on bob@REDACTED and save its pid:
> (bob@REDACTED)2> whereis (bob).
> <0.65.0>
> (bob@REDACTED)3> Pid = whereis (bob).
> <0.65.0>
>
> I cut the network cable and wait a minute for the timeout.
>
> On alice@REDACTED I get the following output:
> =ERROR REPORT==== 17-Aug-2011::10:53:21 ===
> ** Node 'bob@REDACTED' not responding **
> ** Removing (timedout) connection **
> Bob died of noconnection.
>
> Nice, Alice trapped Bob's death and reported it. I check for Alice:
> (alice@REDACTED)2> whereis (alice).
> <0.42.0>
>
> Alice is up and running.
>
> On bob@REDACTED I get the following output:
> =ERROR REPORT==== 17-Aug-2011::10:53:10 ===
> ** Node 'alice@REDACTED' not responding **
> ** Removing (timedout) connection **
>
> But Bob is dead:
> (bob@REDACTED)4> whereis (bob).
> undefined
> (bob@REDACTED)5> is_process_alive (Pid).
> false
>
> I really do not understand what is happening.
>
>
> Thank you in advance
>
> Anchise
>
> Here goes the code I used:
>
> -module (alice).
> -compile (export_all).
>
> start () -> register (alice, spawn (fun init/0) ).
>
> stop () -> whereis (alice) ! stop.
>
> init () ->
>     process_flag (trap_exit, true),
>     Bob = spawn_link ('bob@REDACTED', bob, start, [self () ] ),
>     loop (Bob).
>
> loop (Bob) ->
>     receive
>         stop -> ok;
>         {'EXIT', Bob, Reason} ->
>             io:format ("Bob died of ~p.~n", [Reason] ),
>             loop (Bob);
>         Msg ->
>             io:format ("Alice received ~p.~n", [Msg] ),
>             loop (Bob)
>     end.
>
>
> -module (bob).
> -compile (export_all).
>
> start (Alice) ->
>     process_flag (trap_exit, true),
>     register (bob, self () ),
>     loop (Alice).
>
> loop (Alice) ->
>     receive
>         stop -> ok;
>         {'EXIT', Alice, Reason} ->
>             io:format ("Alice died of ~p.~n", [Reason] ),
>             loop (Alice);
>         Msg ->
>             io:format ("Bob received ~p.~n", [Msg] ),
>             loop (Alice)
>     end.
>
>
> On Wed, 17 Aug 2011 12:47:02 +0200, Vincenzo Maggio wrote:
> > Hello,
> > without further info, debugging is rather difficult.
> > But let's try to at least start analysing the problem:
> >
> >> - Does this have something to do with the fact that I initially
> >> spawn Bob from the Alice node?
> >
> > Absolutely not: this would hit the very foundation of Erlang, process
> > referential transparency. When a process is started, it is a brand new,
> > clean entity (indeed, the default process heap space is always the same
> > size!).
> >
> >> - How can I make Bob survive a connection loss?
> >
> > Look above: it SHOULD survive.
> >
> > Can you please start SASL (application:start(sasl) from the shell)
> > and see whether the shell log shows any further information?
> >
> > Vincenzo
>