[erlang-questions] Fun with the distribution mechanism in Erlang

Lars-Åke Fredlund fred@REDACTED
Fri Mar 9 15:37:43 CET 2007


Hans Svensson and I were investigating the restart behaviour of nodes 
in Erlang. We wanted to know, for instance, whether pids created on a 
node named "n" are comparable to pids created on a node with the same 
name after a restart.

(all, of course, for the noble purpose of eventually really 
understanding the detailed
semantics of distributed Erlang :-)


Anyway, after some experimenting Hans came up with the following 
program, which behaves a bit unexpectedly (attached as 
"strangeCommunication.erl").

Three nodes are used (n1,n2,n3);  n2 is restarted automatically whenever 
it halts (by the shell file "restartingErlangshell.sh").

When strangeCommunication:run() is started on node n1, it performs the 
following steps three times:
   - starts a process on n2
   - halts node n2

The result is a list of three (dead) process identifiers. We are sure 
they are dead since we have
received exit messages regarding them.
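
The steps above could be sketched roughly as follows. This is a 
hypothetical reconstruction under our own assumptions, not the attached 
strangeCommunication.erl: the module name restart_sketch, the function 
names, and the 200 ms polling interval are all made up for illustration, 
and the restart of n2 is assumed to be handled by the external shell 
script.

```erlang
-module(restart_sketch).
-export([collect_dead_pids/2]).

%% Hypothetical reconstruction; NOT the attached strangeCommunication.erl.
%% Returns a list of Rounds pids, one per incarnation of Node. Each pid
%% is known to be dead because its exit message has been received.
collect_dead_pids(Node, Rounds) ->
    process_flag(trap_exit, true),
    [one_round(Node) || _ <- lists:seq(1, Rounds)].

one_round(Node) ->
    wait_until_up(Node),
    %% The remote process just evaluates erlang:self() and exits normally.
    Pid = spawn_link(Node, erlang, self, []),
    receive {'EXIT', Pid, _Reason} -> ok end,
    %% Ask for a nodedown message, then halt the remote node.
    monitor_node(Node, true),
    rpc:cast(Node, erlang, halt, []),
    receive {nodedown, Node} -> ok end,
    Pid.

%% Poll until the external restart script has brought Node back up.
wait_until_up(Node) ->
    case net_adm:ping(Node) of
        pong -> ok;
        pang -> timer:sleep(200), wait_until_up(Node)
    end.
```

From a distributed shell on n1, with n2 standing for the full node name, 
something like restart_sketch:collect_dead_pids(n2, 3) would then 
return the three dead pids.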

We then spawn a new process on n2 which just echoes received messages 
back to the sender.
The three pids (of dead processes) are communicated to a newly spawned 
process on node n3, which tries to communicate with each of the dead 
processes. Rather surprisingly, one of the communications succeeds!

(test@REDACTED)1> strangeCommunication:start().
Killing and restarting node n2@REDACTED
Killing and restarting node n2@REDACTED
Killing and restarting node n2@REDACTED
Got: {'EXIT',<4981.41.0>,normal}
Got: {'EXIT',<4981.41.0>,normal}
Got: {'EXIT',<4981.41.0>,normal}
Trying to communicate with: <4981.41.0>
(<<131,103,100,0,10,110,50,64,106,101,122,97,98,101,108,0,0,0,41,0,0,0,0,3>>) 

Recieved 6 from <4981.41.0>   (!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!)

(<<131,103,100,0,10,110,50,64,106,101,122,97,98,101,108,0,0,0,41,0,0,0,0,3>>) 

Trying to communicate with: <4981.41.0>
(<<131,103,100,0,10,110,50,64,106,101,122,97,98,101,108,0,0,0,41,0,0,0,0,1>>) 

No reply!      
Trying to communicate with: <4981.41.0>
(<<131,103,100,0,10,110,50,64,106,101,122,97,98,101,108,0,0,0,41,0,0,0,0,2>>) 

No reply!  
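
The binaries in parentheses look like the external term format of each 
pid (as produced by term_to_binary/1): 131 is the version tag, 103 is 
PID_EXT and 100 is ATOM_EXT. A minimal sketch of a decoder for exactly 
this layout is given below. The module name pid_ext and the shape of 
the returned tuple are our own choices for illustration, and newer 
releases may encode pids with different tags that this clause does not 
match.

```erlang
-module(pid_ext).
-export([decode/1]).

%% Decode a pid encoded as: version tag 131, PID_EXT (103), with the
%% node name given as ATOM_EXT (100, 16-bit length, name bytes),
%% followed by a 32-bit id, a 32-bit serial and a one-byte creation.
%% Returns {NodeName, Id, Serial, Creation}, with NodeName as a string.
decode(<<131, 103, 100, Len:16, Node:Len/binary,
         Id:32, Serial:32, Creation:8>>) ->
    {binary_to_list(Node), Id, Serial, Creation}.
```

Applied to the binaries above, the node name, id (41) and serial (0) 
come out identical in all cases. Only the final creation byte differs: 
it is 3 for the pid that got a reply and for the sender of that reply, 
and 1 and 2 for the two pids that got no reply.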

We all know that pids eventually wrap around, but it seems that when 
nodes are restarted, pids are going to be reused much earlier than one 
would think. Maybe it would be a good idea to permit more restarts than 
three before reuse?
(it seems like three is the magic number being used in the runtime system)
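
To make the arithmetic concrete, here is a small sketch under the 
assumption that the per-restart counter embedded in a pid simply 
cycles through 1, 2, 3. That assumption is consistent with the last 
byte of each binary printed above, but we have not confirmed it 
against the runtime system. The module and function names are 
hypothetical.

```erlang
-module(creation_cycle).
-export([next/1, after_restarts/2]).

%% ASSUMPTION: the counter takes only the values 1..3 and advances
%% by one, wrapping from 3 back to 1, on every restart of the node.
next(Creation) when Creation >= 1, Creation =< 3 ->
    (Creation rem 3) + 1.

%% Counter value after N restarts, starting from Creation.
after_restarts(Creation, 0) ->
    Creation;
after_restarts(Creation, N) when N > 0 ->
    after_restarts(next(Creation), N - 1).
```

Under this assumption creation_cycle:after_restarts(3, 3) evaluates to 
3, so a process started after the third restart can carry exactly the 
same pid as one that died before the first restart.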


Lars-Åke and Hans

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: strangeCommunication.erl
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20070309/09d70fb6/attachment.ksh>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: restartingErlangshell.sh
Type: application/x-shellscript
Size: 132 bytes
Desc: not available
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20070309/09d70fb6/attachment.bin>

