[erlang-questions] Process surviving disconnect

Anchise Inzaghi erlang@REDACTED
Wed Aug 17 22:13:37 CEST 2011


 Thank you for your quick answer.

> BTW, before filing a bug, could you please substitute spawn_link with
> spawn_monitor and remove the process_flag lines? It would be
> interesting to understand whether bob dies on its own or is killed
> because it can no longer communicate with alice.

 I am not sure how to replace spawn_link with spawn_monitor, as neither
 spawn_monitor/1 nor spawn_monitor/3 takes a node parameter.
 How do I do that, or how else can I get more detailed information
 about Bob's sudden crossing of the Styx?


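 To see why Bob dies, as observed on his own node, the shell on the
 bob node can monitor him before the cable is cut. This is a sketch of
 such a session, not a confirmed diagnosis. The group leader check is
 an assumption worth testing: a remotely spawned process inherits the
 group leader of the spawning process, so Bob's io:format output may
 be routed to the alice node.

```erlang
%% Run in the shell of the bob node before cutting the cable.
Pid = whereis(bob).
Ref = erlang:monitor(process, Pid).
%% Find out which node handles Bob's io:format output.
{group_leader, GL} = erlang:process_info(Pid, group_leader).
node(GL).
%% After the disconnect, print the messages the shell received,
%% including any {'DOWN', Ref, process, Pid, Reason}.
flush().
```

 If node(GL) turns out to be the alice node, it would be worth checking
 whether the io:format call in Bob's 'EXIT' clause fails once the
 connection is gone.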

 On Wed, 17 Aug 2011 21:48:30 +0200, Vincenzo Maggio wrote:
> Hello,
> I think something is wrong here.
>>  Bob died of noconnection.
>
> This is printed by
>>  		{'EXIT', Bob, Reason} ->
>>  			io:format ("Bob died of ~p.~n", [Reason] ),
>
> So alice is in fact receiving bob's last death cry :D and process_flag
> converts the exit signal into a message instead of letting it
> terminate alice. I think this is ok from Alice's point of view, so the
> real problem is that bob is dying (I know it's mundane, but I have
> learned not to make assumptions).
> Mmm, well, I don't know whether losing the connection between the
> processes makes the Erlang VM make some assumption about a virtual
> master node.
> Well, if you want my opinion, I think that you should file a bug on
> the Erlang bugs mailing list if no one comes up with a proper
> explanation.
> Even if what we're thinking is wrong and this is not a bug, in the
> past I had a problem with node lookup and they resolved it.
> These are my two cents, but if you can, please let me know if there
> are further updates, 'cause it's a really interesting problem.
>
> BTW, before filing a bug, could you please substitute spawn_link with
> spawn_monitor and remove the process_flag lines? It would be
> interesting to understand whether bob dies on its own or is killed
> because it can no longer communicate with alice.
>
> Vincenzo
>
>> Date: Wed, 17 Aug 2011 10:57:35 -0600
>> From: erlang@REDACTED
>> To: maggio.vincenzo@REDACTED
>> CC: erlang-questions@REDACTED
>> Subject: RE: [erlang-questions] Process surviving disconnect
>>
>>  Thank you very much Vincenzo. You confirmed my assertion that Bob
>>  should survive the disconnect.
>>  Nevertheless he dies.
>>  I will point out exactly what I do and maybe someone can spot the
>>  error in my code, my setup or my thinking and tell me what I am
>>  doing wrong.
>>
>>  I start a node on gca.local:
>>  unroot@REDACTED:~$ erl -name 'bob@REDACTED' -setcookie 123
>>
>>  I start a node on usa.local:
>>  unroot@REDACTED:~$ erl -name 'alice@REDACTED' -setcookie 123
>>
>>  I start sasl on bob@REDACTED:
>>  (bob@REDACTED)1> application:start (sasl).
>>
>>  I run alice:start/0 on alice@REDACTED:
>>  (alice@REDACTED)1> alice:start ().
>>  true
>>
>>  I look for bob on bob@REDACTED and save its pid:
>>  (bob@REDACTED)2> whereis (bob).
>>  <0.65.0>
>>  (bob@REDACTED)3> Pid = whereis (bob).
>>  <0.65.0>
>>
>>  I cut the network cable and wait a minute for the timeout.
>>
>>  On alice@REDACTED I get the following output:
>>  =ERROR REPORT==== 17-Aug-2011::10:53:21 ===
>>  ** Node 'bob@REDACTED' not responding **
>>  ** Removing (timedout) connection **
>>  Bob died of noconnection.
>>
>>  Nice, Alice trapped Bob's death and reported it. I check for Alice:
>>  (alice@REDACTED)2> whereis (alice).
>>  <0.42.0>
>>
>>  Alice is up and running.
>>
>>  On bob@REDACTED I get the following output:
>>  =ERROR REPORT==== 17-Aug-2011::10:53:10 ===
>>  ** Node 'alice@REDACTED' not responding **
>>  ** Removing (timedout) connection **
>>
>>  But Bob is dead:
>>  (bob@REDACTED)4> whereis (bob).
>>  undefined
>>  (bob@REDACTED)5> is_process_alive (Pid).
>>  false
>>
>>  I really do not understand what is happening.
>>
>>
>>  Thank you in advance
>>
>>  Anchise
>>
>>  Here goes the code I used:
>>
>>  -module (alice).
>>  -compile (export_all).
>>
>>  start () -> register (alice, spawn (fun init/0) ).
>>
>>  stop () -> whereis (alice) ! stop.
>>
>>  init () ->
>>  	process_flag (trap_exit, true),
>>  	Bob = spawn_link ('bob@REDACTED', bob, start, [self () ] ),
>>  	loop (Bob).
>>
>>  loop (Bob) ->
>>  	receive
>>  		stop -> ok;
>>  		{'EXIT', Bob, Reason} ->
>>  			io:format ("Bob died of ~p.~n", [Reason] ),
>>  			loop (Bob);
>>  		Msg ->
>>  			io:format ("Alice received ~p.~n", [Msg] ),
>>  			loop (Bob)
>>  	end.
>>
>>
>>  -module (bob).
>>  -compile (export_all).
>>
>>  start (Alice) ->
>>  	process_flag (trap_exit, true),
>>  	register (bob, self () ),
>>  	loop (Alice).
>>
>>  loop (Alice) ->
>>  	receive
>>  		stop -> ok;
>>  		{'EXIT', Alice, Reason} ->
>>  			io:format ("Alice died of ~p.~n", [Reason] ),
>>  			loop (Alice);
>>  		Msg ->
>>  			io:format ("Bob received ~p.~n", [Msg] ),
>>  			loop (Alice)
>>  	end.
>>
>>
>>  On Wed, 17 Aug 2011 12:47:02 +0200, Vincenzo Maggio wrote:
>> > Hello,
>> > without further information, debugging is rather difficult.
>> > But let's at least try to start analyzing the problem:
>> >
>> >>   - Does this have something to do with the fact that I initially
>> >>  spawn Bob from the Alice node?
>> >
>> > Absolutely not: this would hit the very foundation of Erlang,
>> > process referential transparency. When a process is started, it is
>> > a brand new, clean entity (indeed, the default process heap size
>> > is always the same!).
>> >
>> >>   - How can I make Bob survive a connection loss?
>> >
>> > Look above: it SHOULD survive.
>> >
>> > Can you please start SASL (application:start(sasl) from the shell)
>> > and see whether the shell log gives any further information?
>> >
>> > Vincenzo
>>



