[erlang-questions] Process surviving disconnect

Vincenzo Maggio maggio.vincenzo@REDACTED
Wed Aug 17 22:23:06 CEST 2011


Use monitor(process, PID_TO_BE_MONITORED) after a plain spawn (spawn/4 takes a node argument).
BTW, not having two PCs, I simply tested your code by killing alice, and guess what? bob survives! :)
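
A minimal sketch of that change to alice, untested across two machines; the module name alice_mon is a placeholder and bob.erl is assumed unchanged:

```erlang
-module (alice_mon).
-export ([start/0]).

start () -> register (alice, spawn (fun init/0) ).

init () ->
	%% spawn/4 takes a node argument. No link is created, so alice
	%% does not need process_flag (trap_exit, true) here.
	Bob = spawn ('bob@REDACTED', bob, start, [self () ] ),
	%% Monitor the remote pid; the reference identifies this monitor.
	Ref = monitor (process, Bob),
	loop (Bob, Ref).

loop (Bob, Ref) ->
	receive
		stop -> ok;
		{'DOWN', Ref, process, Bob, Reason} ->
			%% Bob is gone (or unreachable), so the loop ends here.
			io:format ("Bob went down: ~p.~n", [Reason] );
		Msg ->
			io:format ("Alice received ~p.~n", [Msg] ),
			loop (Bob, Ref)
	end.
```

A monitor is one-directional: alice gets a {'DOWN', ...} message (with reason noconnection if bob's node becomes unreachable), while no exit signal travels towards bob. If bob has already exited by the time monitor/2 is called, the 'DOWN' message arrives with reason noproc.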

> Date: Wed, 17 Aug 2011 14:13:37 -0600
> From: erlang@REDACTED
> To: maggio.vincenzo@REDACTED
> CC: erlang-questions@REDACTED
> Subject: RE: [erlang-questions] Process surviving disconnect
> 
>  Thank you for your quick answer.
> 
> > BTW, before filing a bug, could you please substitute spawn_link with
> > spawn_monitor and remove the process_flag lines? It would be
> > interesting to understand whether bob dies on its own or is killed
> > because it can no longer communicate with alice.
> 
>  I am not sure how to replace spawn_link with spawn_monitor, as neither
>  spawn_monitor/1 nor spawn_monitor/3 takes a node parameter.
>  How do I do that, or how else can I get more detailed information
>  about Bob's sudden crossing of the Styx?
> 
> 
> 
>  On Wed, 17 Aug 2011 21:48:30 +0200, Vincenzo Maggio wrote:
> > Hello,
> > I think something is wrong here.
> >>  Bob died of noconnection.
> >
> > This is printed by
> >>  		{'EXIT', Bob, Reason} ->
> >>  			io:format ("Bob died of ~p.~n", [Reason] ),
> >
> > So alice is in fact receiving bob's last death cry :D and process_flag
> > translates it into a message instead of propagating the exit signal to
> > alice; I think this is OK from Alice's point of view, so the real
> > problem is that bob is dying (I know it's mundane, but I have learned
> > not to make assumptions).
> > Mmm, well, I don't know whether losing the connection between the
> > processes makes the Erlang VM assume some kind of virtual master node.
> > Well, if you want my opinion, I think that you should file a bug on
> > the Erlang bugs mailing list if no one comes up with a proper
> > explanation.
> > Even if what we're thinking is wrong and this is not a bug, in the
> > past I had a problem on node lookup and they resolved it.
> > These are my two cents, but if you can please let me know if there
> > are further updates 'cause it's a really interesting problem.
> >
> > BTW, before filing a bug, could you please substitute spawn_link with
> > > spawn_monitor and remove the process_flag lines? It would be
> > > interesting to understand whether bob dies on its own or is killed
> > > because it can no longer communicate with alice.
> >
> > Vincenzo
> >
> >> Date: Wed, 17 Aug 2011 10:57:35 -0600
> >> From: erlang@REDACTED
> >> To: maggio.vincenzo@REDACTED
> >> CC: erlang-questions@REDACTED
> >> Subject: RE: [erlang-questions] Process surviving disconnect
> >>
> >>  Thank you very much Vincenzo. You affirmed my assertion that Bob should
> >>  survive the disconnect.
> >>  Nevertheless he dies.
> >>  I will point out exactly what I do and maybe someone can spot the error
> >>  in my code, my setup or my thinking and tell me what I am doing wrong.
> >>
> >>  I start a node on gca.local:
> >>  unroot@REDACTED:~$ erl -name 'bob@REDACTED' -setcookie 123
> >>
> >>  I start a node on usa.local:
> >>  unroot@REDACTED:~$ erl -name 'alice@REDACTED' -setcookie 123
> >>
> >>  I start sasl on bob@REDACTED:
> >>  (bob@REDACTED)1> application:start (sasl).
> >>
> >>  I run alice:start/0 on alice@REDACTED:
> >>  (alice@REDACTED)1> alice:start ().
> >>  true
> >>
> >>  I look for bob on bob@REDACTED and save its pid:
> >>  (bob@REDACTED)2> whereis (bob).
> >>  <0.65.0>
> >>  (bob@REDACTED)3> Pid = whereis (bob).
> >>  <0.65.0>
> >>
> >>  I cut the network cable and wait a minute for the timeout.
> >>
> >>  On alice@REDACTED I get the following output:
> >>  =ERROR REPORT==== 17-Aug-2011::10:53:21 ===
> >>  ** Node 'bob@REDACTED' not responding **
> >>  ** Removing (timedout) connection **
> >>  Bob died of noconnection.
> >>
> >>  Nice, Alice trapped Bob's death and reported it. I check for Alice:
> >>  (alice@REDACTED)2> whereis (alice).
> >>  <0.42.0>
> >>
> >>  Alice is up and running.
> >>
> >>  On bob@REDACTED I get the following output:
> >>  =ERROR REPORT==== 17-Aug-2011::10:53:10 ===
> >>  ** Node 'alice@REDACTED' not responding **
> >>  ** Removing (timedout) connection **
> >>
> >>  But Bob is dead:
> >>  (bob@REDACTED)4> whereis (bob).
> >>  undefined
> >>  (bob@REDACTED)5> is_process_alive (Pid).
> >>  false
> >>
> >>  I really do not understand what is happening.
> >>
> >>
> >>  Thank you in advance
> >>
> >>  Anchise
> >>
> >>  Here goes the code I used:
> >>
> >>  -module (alice).
> >>  -compile (export_all).
> >>
> >>  start () -> register (alice, spawn (fun init/0) ).
> >>
> >>  stop () -> whereis (alice) ! stop.
> >>
> >>  init () ->
> >>  	process_flag (trap_exit, true),
> >>  	Bob = spawn_link ('bob@REDACTED', bob, start, [self () ] ),
> >>  	loop (Bob).
> >>
> >>  loop (Bob) ->
> >>  	receive
> >>  		stop -> ok;
> >>  		{'EXIT', Bob, Reason} ->
> >>  			io:format ("Bob died of ~p.~n", [Reason] ),
> >>  			loop (Bob);
> >>  		Msg ->
> >>  			io:format ("Alice received ~p.~n", [Msg] ),
> >>  			loop (Bob)
> >>  	end.
> >>
> >>
> >>  -module (bob).
> >>  -compile (export_all).
> >>
> >>  start (Alice) ->
> >>  	process_flag (trap_exit, true),
> >>  	register (bob, self () ),
> >>  	loop (Alice).
> >>
> >>  loop (Alice) ->
> >>  	receive
> >>  		stop -> ok;
> >>  		{'EXIT', Alice, Reason} ->
> >>  			io:format ("Alice died of ~p.~n", [Reason] ),
> >>  			loop (Alice);
> >>  		Msg ->
> >>  			io:format ("Bob received ~p.~n", [Msg] ),
> >>  			loop (Alice)
> >>  	end.
> >>
> >>
> >>  On Wed, 17 Aug 2011 12:47:02 +0200, Vincenzo Maggio wrote:
> >> > Hello,
> >> > without further info, debugging is rather difficult.
> >> > But let's at least try to start analysing the problem:
> >> >
> >> >>   - Does this have something to do with the fact that I initially
> >> >>  spawn Bob from the Alice node?
> >> >
> >> > Absolutely not: this would hit the very foundation of Erlang, process
> >> > referential transparency. When a process is started, it is a brand-new,
> >> > clean entity (indeed, the default process heap space is always the same
> >> > size!).
> >> >
> >> >>   - How can I make Bob survive a connection loss?
> >> >
> >> > Look above: it SHOULD survive.
> >> >
> >> > Can you please start SASL (application:start(sasl) from the shell)
> >> > and see whether the shell log shows any further information?
> >> >
> >> > Vincenzo
> >>
> 
 		 	   		  