[erlang-questions] Ideas for a new Erlang

Fri Jun 27 21:56:51 CEST 2008

Ulf Wiger wrote:
> What I meant was that if we're discussing weird things that could happen
> if we'd allow automatic GC of a process that's blocked, not registered,
> and where there are no known references to the pid, we shouldn't
> necessarily pay too much attention to the cases where the pid has
> been serialized e.g. using term_to_binary(), because this is not really
> different from the problems that can occur already.

I'm unclear what you mean by "already"? Already in the current Erlang, 
where a process blocked in a receive keeps running, or in a new Erlang 
where a process blocked in a receive can get GCed?

I'm simply pointing out that a process blocked in a receive getting GCed 
when there are "no known references to it" would seem to be very, very 
common. Any time you do
    {Someone,Somewhere} ! {starting, spawn(fun loop/0)}
you're going to risk instantly GCing the process you just spawned, 
because the only place the PID exists is in the TCP buffers in the 
kernel.
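
To make that concrete, here is a minimal sketch of the pattern. The module name gc_sketch, the start/1 wrapper and the stop message are made up for illustration; only the send-a-freshly-spawned-pid line comes from the example above.

```erlang
-module(gc_sketch).
-export([start/1, loop/0]).

%% Spawn a worker and hand its pid straight to Receiver.
%% Once the send has been issued, this function keeps no reference to
%% the pid: until Receiver picks the message up, the pid exists only
%% inside the message in flight.
start(Receiver) ->
    Receiver ! {starting, spawn(fun loop/0)},
    ok.

%% The worker blocks in receive until it is told to stop.
loop() ->
    receive
        stop   -> ok;
        _Other -> loop()
    end.
```

Under the proposed rule, the worker would be blocked, unregistered and unreferenced on its own node for exactly the interval in which the message is in transit.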

> We must always allow for the event that the process dies.

A process blocked in a receive that nobody is sending to isn't going to die 
unless the node dies, which we already can find out without having a 
reference to the pid. If the node dies, we know all the pids on that 
node are invalid.

> Now bearing in mind that we often have distributed systems, and
> no distributed GC, it is of course an excellent idea to not reuse pids
> immediately.

Indeed, the system goes out of its way to let you discover that it's 
going to reuse pids. I don't think it's possible for the system to reuse 
a pid without you being able to discover it has been reused, unless the 
process that gets reused explicitly unlinked itself from you after you 
linked it.

> Either way, it may affect how soon pids are reused,

Except that if you're worried about it, you link to the pid and/or 
monitor the node. Presumably the runtime won't reuse the pid before 
telling you the previous process has exited.
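
A rough sketch of the "watch it yourself" approach, using erlang:monitor/2 rather than a link. The module name watch_sketch and the watch/1 helper are hypothetical; the 'DOWN' message shape is the standard one.

```erlang
-module(watch_sketch).
-export([watch/1]).

%% Monitor Pid and block until the runtime reports that it is gone.
%% Returns {down, Reason}, where Reason is whatever the 'DOWN'
%% message carried.
watch(Pid) ->
    Ref = erlang:monitor(process, Pid),
    receive
        {'DOWN', Ref, process, Pid, Reason} ->
            {down, Reason}
    end.
```

As long as the monitor is in place, the 'DOWN' message should arrive before any later process could be handed the same pid, which is the ordering assumed above.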

> the time ought to be long enough that it should not be a problem for all
> recommended uses of pids.

Hmm. And what are the recommended uses of pids? This I haven't seen 
written down anywhere.  It's not too hard to deduce, and to a first 
approximation it sounds like "don't run the Erlang distribution 
primitives over an unreliable network" to me.

> As ROK wrote, the suggestion wasn't made seriously to begin with,

OK. I was just checking, because he really seems to understand this 
stuff deeply. :-)  I hadn't expected my follow-up to generate any more 
than "yes, it would be very difficult to implement that in reality." :-)

> Personally, I don't think it would be feasible to introduce such
> semantics anyway, since it would rarely be useful, and might
> sometimes cause very strange behaviour.

I think in Erlang that's probably true. There's also a similar language 
called Hermes where you very specifically kill a process by closing all 
its outgoing channels. (The Erlang equivalent would be killing off any 
processes whose PIDs you've passed to it.)

>>> It should monitor the process in order to detect whether it dies.
>> Again, not what I'm talking about. Erlang gives me ways to mostly do
>> this unreliably. (If Erlang had a reliable way to do it, we wouldn't
>> need mnesia:set_master_nodes, for example.)
> 
> I don't follow your reasoning here. Erlang offers a reliable way of learning
> whether a process is no longer reachable.

Yep. That doesn't tell me reliably whether the process died. If the node 
becomes unreachable, then becomes reachable again, I understand that the 
pid is still valid, is it not?
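
The distinction shows up in the reason carried by a monitor's 'DOWN' message. A small sketch, with a hypothetical module down_reason_sketch and made-up return atoms:

```erlang
-module(down_reason_sketch).
-export([interpret/1]).

%% Classify the Reason field of a {'DOWN', Ref, process, Pid, Reason}
%% message. The returned atoms are illustrative only.

%% The connection to the remote node was lost. This says nothing about
%% whether the process itself is still alive over there.
interpret(noconnection) -> unreachable;
%% There was no such process at the time the monitor was set up.
interpret(noproc) -> gone;
%% Any other reason is the exit reason of a process that really exited.
interpret(_Reason) -> exited.
```

Only the last clause is evidence that the process died; noconnection just means "can't tell from here".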

> There are lots of situations like that in distributed processing where
> the runtime system cannot possibly give you guarantees or resolve
> the situation automatically.

I don't disagree. I simply said "I can't reliably monitor a process to 
determine if it has died." You're agreeing in detail, it seems.

>> Except that storing pids "persistently" in variables is how one uses

> I disagree. 

I think you're interpreting "persistent" differently than I meant it. My 
question was more along the lines of "why is it worse to store a PID in 
a disk file I recreate on start-up, or in a term_to_binary() binary, than 
it is to store it in a local variable of a function that's expected to run 
indefinitely?"  I.e., there's no real lifetime on PIDs even if you don't 
store them as anything other than PIDs. Hence, the "how long is it 
valid" argument doesn't really mean anything. I'm storing the PID for an 
indefinite amount of time. Sure, it's in volatile storage, but the node 
is going to run for five years without being turned off, so...
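
For what it's worth, a serialized pid is no less usable than one held in a variable, at least on the node that created it. A minimal sketch (the module name pid_roundtrip_sketch and the roundtrip/0 helper are made up for illustration):

```erlang
-module(pid_roundtrip_sketch).
-export([roundtrip/0]).

%% Serialize a pid, deserialize it again, and check that the result
%% still names the same process.
roundtrip() ->
    Pid = spawn(fun() -> receive stop -> ok end end),
    Bin = term_to_binary(Pid),      % the pid now also lives in a binary
    Pid2 = binary_to_term(Bin),     % ...and is recovered from it
    Same = (Pid2 =:= Pid),
    Pid2 ! stop,                    % the recovered pid is a usable address
    Same.
```

Nothing visible to the runtime's reference tracking distinguishes Bin from any other binary, yet it holds a perfectly good reference to the process.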

-- 
Darren New / San Diego, CA, USA (PST)
  Helpful housekeeping hints:
   Check your feather pillows for holes
    before putting them in the washing machine.