[erlang-questions] -pidname( <atom> ) Was: Why we need a -module() attribute?

Fri Feb 26 15:53:41 CET 2016

On 26 Feb 2016, at 1:52 PM, <ok@REDACTED> <ok@REDACTED> wrote:

>>> Almost all process registrations should be local to the module that
>>> registers them.  I proposed
>>> 
>>>     -pidname(<atom>).
>>> 
>>> for declaring a module-local atomic variable to hold pids.  This would
>>> be both safer than the global registry and more efficient.  The global
>>> registry would remain for those things that genuinely need to be global.
>> 
>> Presuming that the module name is, say, test, how is:
>> 
>> -pidname( test )
>> 
>> different from:
>> 
>> register( ?MODULE, self() )?
> 
> First off, let me repeat that a pidname directive declares
> a MODULE-LOCAL (as in COMPLETELY INACCESSIBLE FROM THE OUTSIDE)
> atomic variable.

The phrase "atomic variable" in the context of Erlang is a bit counter-intuitive, maybe even to the point of being an oxymoron.

And when you say COMPLETELY INACCESSIBLE FROM THE OUTSIDE, I assume this would exclude read access as well.
If the whole point of registering processes is to provide a static way of sending a message to a process, how would that work under these circumstances?

> 
> Second, a module name names a module, a pidname atom names a
> pid variable.  They are not the same thing and there is no
> compelling reason for them to have the same name.  Indeed, a
> module may declare as many pidnames as it needs; they cannot
> all have the same name as the module.

Are you saying that one can have more than one -pidname declaration per module?

Don't you think this would mean that you would have to know in advance (that is, at compile time) all the names that a process powered by this module would ever have?
If so, I am quite surprised that you do not see a problem with that.

Also, seeing that a module name and a pidname variable are not the same thing, what would happen if two different modules used the same name for a pidname?
Of course, you may solve this problem by stating that these are COMPLETELY INACCESSIBLE FROM THE OUTSIDE, but again, assuming that the reason for registration is to have a "static" way to send a message to a process, what would be the point of having them in the first place if they are COMPLETELY INACCESSIBLE FROM THE OUTSIDE?

> 
> Third, -pidname just *declares* a module-local variable for
> holding pids, it doesn't register anything.  You'd have to
> do something like pidname:set(<the atom you declared>, self()).

Given standard Erlang syntax, it stands to reason that pidname:set( <the atom you declared>, self() ) invokes a function set that belongs to another module (pidname) and, as such, goes against your earlier stipulation that -pidname declarations are COMPLETELY INACCESSIBLE FROM THE OUTSIDE.
And, if they are accessible from the outside via pidname:set( <the atom you declared>, self() ), how is that any safer than the registration mechanism already in place?
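
For reference, here is a minimal sketch of the registration mechanism already in place, using only the standard BIFs register/2, whereis/1 and unregister/1. The module name registry_demo and the atom demo_name are made up for illustration. It shows that any process on the node can read, remove and rebind a registered name, which is the behaviour the safety argument is about.

```erlang
-module(registry_demo).
-export([run/0]).

%% Demonstrates that a locally registered name can be read, removed
%% and rebound by any process on the node. demo_name is a made-up name.
run() ->
    Owner = spawn(fun() -> receive stop -> ok end end),
    true = register(demo_name, Owner),
    %% Any process can look the name up...
    Owner = whereis(demo_name),
    %% ...remove the binding without involving the owner...
    true = unregister(demo_name),
    undefined = whereis(demo_name),
    %% ...and bind the same name to an unrelated process.
    Intruder = spawn(fun() -> receive stop -> ok end end),
    true = register(demo_name, Intruder),
    Intruder = whereis(demo_name),
    %% Stop both helper processes; the name is released when Intruder exits.
    Owner ! stop,
    Intruder ! stop,
    ok.
```

Calling registry_demo:run() from the shell should return ok if every step behaves as described.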

> 
> Fourth, just in case you missed it, the whole point is that
> a pidname is strictly LOCAL to its module.  It cannot even be
> seen from the outside.  It's private.  Using the registry,
> every process in the whole node can not only *see* your binding,
> they can *delete* it or *replace* it.  With -pidname, no can do.
> You're safe, and there's no risk of you accidentally clobbering
> someone else's registry entry either.

See the contradiction that I attempted to point out above.

> 
>> 
>> Also, how would you run multiple instances of the same module within the
>> same run-time?
> 
> Why would that even come close to the shadow of the hint of a
> problem?  Each instance has its OWN variable.  That's what
> module-local means.

I wish I could understand this as easily as you can.

When I say instance, well, I actually mean a process, as this is what we register today (and we do that so we can use a symbolic reference instead of pid() in order to send messages to it).
In an ideal situation, you would have a 1:1 correspondence between a module and a process. However, this is rarely the case; e.g. what we register as a gen_server is not the gen_server module, but the actual behavior that runs on top of gen_server. So, which -pidname variable should we use?
And let me just reiterate the point that I made when commenting on your second point above -- are we to provide gen_server with as many names as we have behaviors? Of course not. However, if we have a module gen_server declaring a -pidname( gen_server ), and if we send a message to a gen_server, where would this message go?

The point is, you cannot possibly predict all deployments any given module may have.
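
To illustrate the deployment concern with what exists today, here is a minimal sketch of a single callback module started several times under names chosen at run time. The module name counter, the function bump/1 and the registered names are hypothetical; gen_server:start_link/4 with a {local, Name} tuple and gen_server:call/2 are the standard OTP calls.

```erlang
-module(counter).
-behaviour(gen_server).
-export([start_link/1, bump/1]).
-export([init/1, handle_call/3, handle_cast/2]).

%% The registered name is supplied by the caller at run time;
%% nothing in this module fixes it at compile time.
start_link(Name) ->
    gen_server:start_link({local, Name}, ?MODULE, 0, []).

%% Increment the counter held by the process registered as Name.
bump(Name) ->
    gen_server:call(Name, bump).

init(Count) ->
    {ok, Count}.

handle_call(bump, _From, Count) ->
    {reply, Count + 1, Count + 1}.

handle_cast(_Msg, Count) ->
    {noreply, Count}.
```

Starting it with counter:start_link(counter_a) and counter:start_link(counter_b) gives two independent processes running the same code, each reachable by its own name. Neither name had to be known when the module was compiled.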

> 
> So if you have a module that hides a process, you can load a new
> version and start it, and calls still executing in the old copy
> will still be routed to the old process; the registry entry won't
> (can't) be hijacked by the new module instance.
> 
>> Your proposal may appear to solve one problem (that is, if one chose to
>> call it a problem), but appears to introduce at least one more.
> 
> It may well do so, but you have not identified any.
> 
> It's an anachronism, because nobody had ever heard of Java
> when I invented -pidname, but think of
>    -pidname(fred).
> as an analogue of
>    private static Thread fred = null;

There are some problems outlined above. I am sure you would be able to think of more yourself. 
But then again, I may have missed the point completely. It wouldn't be the first time.

Kind regards

V/
