[erlang-questions] Ideas for a new Erlang

Ulf Wiger ulf@REDACTED
Fri Jun 27 13:23:08 CEST 2008


2008/6/26 Darren New <dnew@REDACTED>:

I guess I should clarify my initial comment:

> Ulf Wiger wrote:
>> 2008/6/26 Darren New <dnew@REDACTED>:
>>> Ulf Wiger wrote:
>>>> I think we can stipulate that when serializing a pid, perhaps storing it
>>>> on disk, then re-creating it and trying to use it, all bets are off
>>>> anyway.*

What I meant was that if we're discussing weird things that could happen
if we allowed automatic GC of a process that is blocked, not registered,
and where there are no known references to the pid, we shouldn't
necessarily pay too much attention to the cases where the pid has
been serialized, e.g. using term_to_binary(), because this is not really
different from the problems that can occur already.

We must always allow for the event that the process dies. If it dies (for
whatever reason), and there are no references to the pid, it is ok for
the runtime system to reuse the pid. Thus, when using a remote pid or
a pid that has been serialized, there is always a (very slight) chance
that it now refers either to a non-existing process or to some other
process entirely.

Now bearing in mind that we often have distributed systems, and
no distributed GC, it is of course an excellent idea to not reuse pids
immediately.
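To illustrate the point about serialized pids (a minimal sketch; the
module name pid_roundtrip and the function demo/0 are made up for this
example): a pid survives a term_to_binary/1 round-trip as a term, but
whether it still names a live process is an entirely separate question.

```erlang
-module(pid_roundtrip).
-export([demo/0]).

%% Sketch: a pid read back with binary_to_term/1 compares equal to the
%% original, but the runtime gives no guarantee about what (if anything)
%% that pid refers to at some later time.
demo() ->
    Pid = spawn(fun() -> receive stop -> ok end end),
    Bin = term_to_binary(Pid),
    Pid2 = binary_to_term(Bin),
    true = (Pid =:= Pid2),            %% equal as terms
    Alive1 = is_process_alive(Pid2),  %% true while the process lives
    Pid ! stop,
    timer:sleep(50),
    Alive2 = is_process_alive(Pid2),  %% false once it has died; the pid
                                      %% may eventually be reused
    {Alive1, Alive2}.
```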

>> In that case, the remote holder of the pid cannot rely on the pid referring
>> to the same process if it's used some time later.
>
> How long is "some time"?

This is implementation- and configuration-dependent, as well as dependent
on "churn". Using the +P flag, the user tells the runtime system the
maximum number of simultaneous processes to allow for. I'm not sure
whether this also affects how soon pids are reused, or whether the entire
address space for pids is always used. Either way, it may affect how soon
pids are reused, but the time ought to be long enough that it is not a
problem for any recommended use of pids.

> We're conflating "reusing the PID" and "GCing a process which we don't
> think anyone will ever wake up". If I GC the process as soon as there's
> no local or remote reference to the PID, I might wind up GCing the
> process while the data representing the PID is on the wire on its way to
> the node going to use it.

As ROK wrote, the suggestion wasn't made seriously to begin with,
but as with any form of GC, one would of course have to come up
with a reasonable way of determining when it is safe to remove the
process. This would have to take into account normal use of pids
in a distributed environment.

Personally, I don't think it would be feasible to introduce such
semantics anyway, since it would rarely be useful, and might
sometimes cause very strange behaviour.

>> It should monitor the process in order to detect whether it dies.
>
> Again, not what I'm talking about. Erlang gives me ways to mostly do
> this unreliably. (If Erlang had a reliable way to do it, we wouldn't
> need mnesia:set_master_nodes, for example.)

I don't follow your reasoning here. Erlang offers a reliable way of learning
whether a process is no longer reachable. The concept of mnesia master
nodes is meant for a situation where full analysis of the problem cannot
be performed in a generic way.

If Erlang is able to tell /why/ a process is no longer reachable, it will
provide that info, but there are obviously cases where it simply isn't
possible to know why. In these cases, it can only tell you that the
other node is not responding (for whatever reason), and thus all its
processes are unreachable. Mnesia will indicate if it detects that
there has been a network partitioning, but it simply cannot resolve
the situation for you. It's a pathological case for which there may
be acceptable application-specific solutions. That's why there's
a facility to set master nodes in mnesia.
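The distinction above can be sketched with a monitor (a minimal example;
the module name mon_sketch and function await_down/1 are my own): when
the remote node goes down, the 'DOWN' reason is noconnection, and the
runtime cannot tell you whether the process itself actually died.

```erlang
-module(mon_sketch).
-export([await_down/1]).

%% Sketch: monitor a process and report why it became unreachable.
%% Note that monitoring an already-dead process delivers 'DOWN' with
%% reason noproc.
await_down(Pid) ->
    Ref = erlang:monitor(process, Pid),
    receive
        {'DOWN', Ref, process, Pid, noconnection} ->
            %% The node is unreachable; we cannot know the process's fate.
            {unreachable, node_down};
        {'DOWN', Ref, process, Pid, Reason} ->
            %% The process itself exited; Reason tells us why.
            {dead, Reason}
    end.
```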

There are lots of situations like that in distributed processing where
the runtime system cannot possibly give you guarantees or resolve
the situation automatically. Besides, Erlang processes are usually
part of a supervision hierarchy, which means that there will be at
least one stable reference to each process (the monitoring from its
closest supervisor).

>> but once a process /has/ died, and all known (local) references are
>> gone, some other process may reuse that pid. This is why storing pids
>> persistently is a very bad idea.
>
> Except that storing pids "persistently" in variables is how one uses
> them. No matter where you store them, in variables or in disk files, you
> have to get rid of them when you get a monitor or link message saying
> the process exited, and you have to deal with the fact that you might
> get those messages when the process has, indeed, not yet exited and
> indeed doesn't know those messages have been sent. I'm not sure that
> storing a pid in a table is particularly more difficult than storing a
> pid in the variables of a process that's not expected to ever exit.

I disagree. To take a related example, mnesia offers different table types
with different persistence requirements. A very good rule is to avoid mixing
data elements with different persistence requirements in the same table.
In remotely managed systems (e.g. via SNMP), you may well keep a
"current alarms" table in mnesia. This table should be created such that
its content doesn't survive a system restart, for the simple reason that all
alarms will be re-generated after such an event, and the old alarms are
useless anyway. To further the analogy, a monitoring system must also
know that if the managed system restarts, all references to current alarms
can, and must, be thrown away.
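A table like that could be defined along these lines (a sketch; the
table, record and module names are made up, and this is essentially
configuration, so mnesia would need to be set up and started first):

```erlang
-module(alarm_tab).
-export([create/0]).

-record(current_alarm, {id, severity, info}).

%% Sketch: a "current alarms" table held in RAM only ({ram_copies, ...}),
%% so its contents do not survive a node restart -- in line with the rule
%% of not mixing persistence requirements in the same table.
create() ->
    mnesia:create_table(current_alarm,
        [{ram_copies, [node()]},
         {attributes, record_info(fields, current_alarm)}]).
```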

BR,
Ulf W
