[erlang-questions] Process surviving disconnect

Anchise Inzaghi erlang@REDACTED
Wed Aug 17 18:57:35 CEST 2011


 Thank you very much, Vincenzo. You confirmed my assumption that Bob should
 survive the disconnect.
 Nevertheless, he dies.
 Below I describe exactly what I do; maybe someone can spot the error
 in my code, my setup, or my thinking and tell me what I am doing wrong.

 I start a node on gca.local:
 unroot@REDACTED:~$ erl -name 'bob@REDACTED' -setcookie 123

 I start a node on usa.local:
 unroot@REDACTED:~$ erl -name 'alice@REDACTED' -setcookie 123

 I start sasl on bob@REDACTED:
 (bob@REDACTED)1> application:start (sasl).

 I run alice:start/0 on alice@REDACTED:
 (alice@REDACTED)1> alice:start ().
 true

 I look for bob on bob@REDACTED and save its pid:
 (bob@REDACTED)2> whereis (bob).
 <0.65.0>
 (bob@REDACTED)3> Pid = whereis (bob).
 <0.65.0>

 I cut the network cable and wait a minute for the timeout.
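
 The same situation can probably be reproduced without pulling the cable.
 As far as I understand, the timeout is governed by the kernel's
 net_ticktime setting (60 seconds by default), and
 erlang:disconnect_node/1 forces a disconnect immediately. The module
 name netsplit and its two functions are hypothetical helpers and not
 part of my original test; a forced disconnect may not behave exactly
 like a physical cable cut.

```erlang
-module(netsplit).
-export([ticktime/0, drop/1]).

%% Hypothetical helper: returns the current net_ticktime in seconds
%% (or {ongoing_change_to, Seconds}), or 'ignored' if the local node
%% is not distributed.
ticktime() ->
    net_kernel:get_net_ticktime().

%% Hypothetical helper: forces a disconnect from Node.
%% Returns true if the node was disconnected, false if it was not
%% connected, or 'ignored' if the local node is not alive.
drop(Node) when is_atom(Node) ->
    erlang:disconnect_node(Node).
```

 On bob's node this would be called as netsplit:drop ('alice@REDACTED'),
 after which the links between the two nodes should break with reason
 noconnection, as in the timeout case.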

 On alice@REDACTED I get the following output:
 =ERROR REPORT==== 17-Aug-2011::10:53:21 ===
 ** Node 'bob@REDACTED' not responding **
 ** Removing (timedout) connection **
 Bob died of noconnection.

 Nice, Alice trapped Bob's death and reported it. I check for Alice:
 (alice@REDACTED)2> whereis (alice).
 <0.42.0>

 Alice is up and running.

 On bob@REDACTED I get the following output:
 =ERROR REPORT==== 17-Aug-2011::10:53:10 ===
 ** Node 'alice@REDACTED' not responding **
 ** Removing (timedout) connection **

 But Bob is dead:
 (bob@REDACTED)4> whereis (bob).
 undefined
 (bob@REDACTED)5> is_process_alive (Pid).
 false

 I really do not understand what is happening.


 Thank you in advance

 Anchise

 Here is the code I used:

 -module (alice).
 -compile (export_all).

 start () -> register (alice, spawn (fun init/0) ).

 stop () -> whereis (alice) ! stop.

 init () ->
 	process_flag (trap_exit, true),
 	Bob = spawn_link ('bob@REDACTED', bob, start, [self () ] ),
 	loop (Bob).

 loop (Bob) ->
 	receive
 		stop -> ok;
 		{'EXIT', Bob, Reason} ->
 			io:format ("Bob died of ~p.~n", [Reason] ),
 			loop (Bob);
 		Msg ->
 			io:format ("Alice received ~p.~n", [Msg] ),
 			loop (Bob)
 	end.


 -module (bob).
 -compile (export_all).

 start (Alice) ->
 	process_flag (trap_exit, true),
 	register (bob, self () ),
 	loop (Alice).

 loop (Alice) ->
 	receive
 		stop -> ok;
 		{'EXIT', Alice, Reason} ->
 			io:format ("Alice died of ~p.~n", [Reason] ),
 			loop (Alice);
 		Msg ->
 			io:format ("Bob received ~p.~n", [Msg] ),
 			loop (Alice)
 	end.
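
 One thing that may be worth checking, stated as an assumption rather
 than a confirmed diagnosis: a process started on another node with
 spawn_link/4 keeps the group leader of the process that spawned it, so
 Bob's io:format/2 calls are served by an I/O process on the alice node.
 If that call fails once the connection is gone, Bob would exit inside
 the 'EXIT' clause before looping again. The sketch below uses a
 hypothetical module bob_diag (not part of my test) that reports where
 the group leader lives and prints through the local user process
 instead.

```erlang
-module(bob_diag).
-export([start/1, leader_node/0, safe_print/2]).

%% Node that hosts the calling process's group leader.
leader_node() ->
    node(group_leader()).

%% Prints via the local 'user' I/O process instead of the group leader.
%% Returns ok, or {error, Reason} if the I/O request fails.
safe_print(Format, Args) ->
    try io:format(user, Format, Args)
    catch _:Reason -> {error, Reason}
    end.

%% Same structure as bob:start/1, but with the diagnostic output.
start(Alice) ->
    process_flag(trap_exit, true),
    register(bob_diag, self()),
    safe_print("Group leader is on ~p, I am on ~p.~n",
               [leader_node(), node()]),
    loop(Alice).

loop(Alice) ->
    receive
        stop -> ok;
        {'EXIT', Alice, Reason} ->
            safe_print("Alice died of ~p.~n", [Reason]),
            loop(Alice);
        Msg ->
            safe_print("Bob received ~p.~n", [Msg]),
            loop(Alice)
    end.
```

 It would be started from Alice's init/0 in the same way as bob, with
 spawn_link ('bob@REDACTED', bob_diag, start, [self () ] ). If the first
 line printed on bob's console names the alice node as the home of the
 group leader, that would support the assumption above.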


 On Wed, 17 Aug 2011 12:47:02 +0200, Vincenzo Maggio wrote:
> Hello,
> without further info, debugging is rather difficult.
> But let's try to at least start analysis of the problem:
>
>>   - Does this have something to do with the fact that I initially
>>  spawn Bob from the Alice node?
>
> Absolutely not: this would hit the very foundation of Erlang, process
> referential transparency. When a process is started, it is a brand new,
> clean entity (indeed, the default process heap space is always the same
> size!).
>
>>   - How can I make Bob survive a connection loss?
>
> Look above: it SHOULD survive.
>
> Can you please start SASL (application:start(sasl) from the shell)
> and see whether the shell log shows any further information?
>
> Vincenzo



