[erlang-questions] -pidname( <atom> ) Was: Why we need a -module() attribute?

Mon Feb 29 04:21:56 CET 2016

On 27/02/16 3:53 am, Valentin Micic wrote:
>
> Phrase "atomic variable" in a context of Erlang is a bit 
> counter-intuitive, maybe even to a point of being an oxymoron.

In the context of multiple processes, "atomic variable" is most definitely
NOT an oxymoron.
> And when you say COMPLETELY INACCESSIBLE FROM THE OUTSIDE, I assume 
> this would exclude read access as well.

Exactly so.
> If the whole point of registering processes is to be able to provide 
> for a static way of sending a message to a process, how would that 
> work under these circumstances?
There are two reasons for using the registry, and the needs of one
reason make for a mechanism which is dangerous for the other,
which ironically seems to be its main use case.

[Private]: a module starts one or more processes.
Code within the module needs a way to locate that process/those
processes, but the intention is that all communication with it/them
should go through functions in the module.

    This is the use-case that -pidname is designed for.
    It is precisely like each module having its own registry with a
    fixed set of keys.

[Public]: a process needs to be visible to other modules, possibly
even to other nodes via the {registry_key,node}!... technique.

    This is not addressed by -pidname and remains exactly as it is now.
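
For reference, the existing public mechanism that -pidname would leave
alone looks like the following minimal sketch.  The module name
public_name_demo and the registered name demo_public are hypothetical;
the tuple form takes the registered name first and the node second, and
node() is used here only so the example runs on a single node.

```erlang
-module(public_name_demo).
-export([run/0]).

%% Hypothetical names: public_name_demo and demo_public are illustrative.

run() ->
    Parent = self(),
    %% A process that answers one ping and then exits.
    Pid = spawn(fun() ->
                    receive
                        {ping, From} -> From ! pong
                    end
                end),
    %% Public registration: visible to every module on this node.
    true = register(demo_public, Pid),
    %% Address the process by {RegisteredName, Node}.  With a remote
    %% node name in place of node(), the same form reaches other nodes.
    {demo_public, node()} ! {ping, Parent},
    receive
        pong -> ok
    after 1000 ->
        timeout
    end.
```

Any module that knows the atom demo_public can do the same send, which
is exactly what you want for a public process.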

It's not about eliminating the registry, but about providing a SAFER
alternative for processes that don't need to be public.
> Are you saying that one can have more than one -pidname declaration 
> per module?

Yes, certainly.  One for each process you wish to name.
>
> Don't you think this would mean that you would have to know in advance 
> (that is, compile-time) about all the names the process powered by 
> this module would ever have?
> If so, I am quite surprised that you do not see a problem with that.

I don't see a problem with that because if you look at actual Erlang code,
module authors *DO* in point of fact know EXACTLY how many names
they need, very often.

Sure, modules may, and some do, create lots of processes (behaviours like
gen_server.erl, for example).
But it would be an extraordinary module that flooded the registry with a
large number of new entries.

Since a process can act as a concurrent data structure (such as a
dictionary, hint hint), if you should happen to have a module that needs
a large number of private registry entries, you only need one -pidname:
register your concurrent data structure process.
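
A minimal sketch of that idea, using today's register/2 because -pidname
does not exist: one registered process holds a map from keys to pids, so
the module needs a single name however many processes it tracks.  The
module name name_table_demo and its functions are hypothetical.

```erlang
-module(name_table_demo).
-export([start/0, bind/2, lookup/1, stop/0]).

%% Hypothetical module: one registered process acts as a dictionary
%% from keys to pids, so only ONE name is ever registered.

start() ->
    Pid = spawn(fun() -> loop(#{}) end),
    true = register(name_table_demo, Pid),
    ok.

bind(Key, Pid) when is_pid(Pid) ->
    call({bind, Key, Pid}).

lookup(Key) ->
    call({lookup, Key}).

stop() ->
    call(stop).

%% Synchronous request/reply, tagged with a unique reference.
call(Request) ->
    Ref = make_ref(),
    name_table_demo ! {self(), Ref, Request},
    receive
        {Ref, Reply} -> Reply
    end.

loop(Table) ->
    receive
        {From, Ref, {bind, Key, Pid}} ->
            From ! {Ref, ok},
            loop(Table#{Key => Pid});
        {From, Ref, {lookup, Key}} ->
            %% maps:find/2 returns {ok, Pid} or error.
            From ! {Ref, maps:find(Key, Table)},
            loop(Table);
        {From, Ref, stop} ->
            From ! {Ref, ok}
    end.
```

With -pidname, the one registered name here would be the one private
entry; as written, the name is still in the public registry.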

Consider, though, asn1rt_driver_handler.erl,
which uses asn1_driver_owner and asn1_driver_port.
Only asn1rt_driver_handler.erl mentions asn1_driver_owner.
Only asn1rt_driver_handler.erl and asn1rt_per_bin_rt2ct.erl
mention asn1_driver_port; a one-line function added to
asn1rt_driver_handler could make that local too.

The problem with these two things being in a global registry is that
ANY module could tamper with those entries or do
asn1_driver_owner!unload.
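
To make the danger concrete, here is a small sketch of what any module on
the node can do to a name it does not own.  The names
registry_tamper_demo and demo_owner are hypothetical stand-ins; this is
not the asn1 code.

```erlang
-module(registry_tamper_demo).
-export([run/0]).

%% Hypothetical names: demo_owner stands in for an entry such as
%% asn1_driver_owner.  Nothing here touches the real asn1 modules.

run() ->
    Owner = spawn(fun owner_loop/0),
    true = register(demo_owner, Owner),

    %% Tampering 1: code that does not own the name can rebind it.
    true = unregister(demo_owner),
    Impostor = spawn(fun owner_loop/0),
    true = register(demo_owner, Impostor),
    Rebound = (whereis(demo_owner) =:= Impostor),

    %% Tampering 2: code that does not own the name can send control
    %% messages through it.
    MRef = erlang:monitor(process, Impostor),
    demo_owner ! unload,
    Stopped = receive
                  {'DOWN', MRef, process, Impostor, normal} -> true
              after 1000 ->
                  false
              end,

    %% Clean up the original owner, which never saw any of this.
    Owner ! unload,
    {Rebound, Stopped}.

owner_loop() ->
    receive
        unload -> ok
    end.
```

Neither step needs any cooperation from the module that registered the
name in the first place.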

*That* is what the -pidname proposal is about.

> Also, seeing that module name and named pidname variable are not the 
> same, what would happen if two different modules uses the same 
> name for a pidname?

What part of "local" and "private" is hard to understand?
Why would any module know or care what pidnames another module uses?

> Of course, you may solve this problem by indicating that these are 
> COMPLETELY INACCESSIBLE FROM THE OUTSIDE, but again, assuming that the 
> reason for registration is to be able to have a "static" way to send a 
> message to a process, what would be the point of having them in the 
> first place if they are COMPLETELY INACCESSIBLE FROM THE OUTSIDE.
Once again, do bear in mind that the -pidname proposal is an ADDITION to
the existing registry, not a complete replacement for it.  If you have a
process that you *want* other modules to be able to talk to directly, just
keep on using the existing registry.

A lot of registry entries created now *would* be private if there were
only some way to make them so.
> Given the Erlang standard syntax, it stands to reason that 
> pidname:set( <the atom you declared>, self() ) invokes a function set, 
> that belongs to another module (pidname), and, as such, goes against 
> your earlier stipulation that -pidname declarations are  COMPLETELY 
> INACCESSIBLE FROM THE OUTSIDE.

Oh please.  The special built-in functions necessary to make the approach
work at all do not honestly count as *outside*.  They're *inside*; part of
the system machinery.
> And, if they are accessible from the outside via pidname:set( <the 
> atom you declared>, self() ), how is that any safer than the 
> registration mechanism already in place?

Because calls to the pidname: module would be special syntax recognised by
the compiler.
> See a contradiction that I attempted to point out above...

Nope.  Can't see it.
> I wish I can understand this as easily as you can.

The key thing is that
  - there are modules that create processes that they WANT to be public
     For those modules, the existing registry is fine.
  - but there are also modules that create a fixed set of named processes
    that they register only so that they can find them again, which WOULD be
    local for safety if only there were some way to do that.

The details of the -pidname proposal are not important.
The idea that the second class of modules (or even the second use; a module
might want to do both things) exists in abundance in the OTP sources and
deserves support, THAT is the key point.
> In ideal situation, you would have 1:1 correspondence between a module 
> and a process.
Why?  I've already pointed out asn1rt_driver_handler, which has TWO
processes it needs to know.  It's not the only one.
> However, this is rarely the case; e.g. what we register as a 
> gen_server is not the gen_server module, but the actual behavior that 
> runs on top of gen_server. So, which -pidname variable should we use?

I am having a seriously hard time understanding why you think there is a
question here.
If -pidnames existed, gen_server would only be able to use -pidnames
declared inside the gen_server module, that is, it would only be able to
register *locally* processes it intended to keep private.  Local, private.
gen_server would not be able to use any other module's pidnames.

There is a way out.
You could have an extended interface to gen_server in which the client
module provided a call-back providing indirect, controlled, access to
selected pidnames.  You'd need to ensure that *only* gen_server could call
that call-back, but that's what -export_to (also proposed long ago) is for.

> And let me just reiterate a point that I've made commenting on your 
> Second point above -- are we to provide gen_server with as many names 
> as we have behaviors?
Is that a serious question?
How often does it need saying?  -pidname is not a REPLACEMENT for
the registry we already have.  It is supposed to offer something DIFFERENT
from that, and so ADDITIONAL to it.  -pidnames aren't *supposed* to do
everything that the registry does; in fact not being ABLE to do that is what
-pidname is all about.

> Of course not. However, if we have a module gen_server declaring a 
> -pidname( gen_server ), and if we send a message to a gen_server, 
> where would this message go?

You've lost me.  There is no magic connection between a -pidname in some
module that happens to be called gen_server and "a gen_server", unless THAT
module contains code that establishes such a connection, and then a message
    pidname:get(gen_server) ! whatever
gets sent to whatever process that module's author chose to bind that
pidname to.

> The point is, you cannot possibly predict all deployments any given 
> module may have.
Sorry to swear, but
what the hell does that have to do with the -pidname proposal?
The -pidname proposal is all about HIDING LOCAL STATE that currently has to
be made public.  For the purposes for which -pidname was designed, you don't
know and you don't *care* about "all deployments any given module may have"
EXCEPT that you don't want them tampering with your internal name::process
bindings.  In fact, the fact that you cannot make this prediction you've
dragged in is the very reason -pidname is needed:  if I want to refer to a
process as fred within one of my modules, I currently have no defence
against you using the same name.
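
The clash is easy to reproduce with the registry as it stands.  In this
sketch (name_clash_demo is a hypothetical module, and the two spawned
processes stand in for "my" module and "your" module), the second
register/2 call for fred fails with badarg because the name is taken.

```erlang
-module(name_clash_demo).
-export([run/0]).

%% Hypothetical module: Mine and Yours stand in for processes started
%% by two unrelated modules that both picked the name fred.

run() ->
    Mine = spawn(fun wait/0),
    Yours = spawn(fun wait/0),
    true = register(fred, Mine),
    %% The second registration of the same name raises badarg.
    Result = try register(fred, Yours) of
                 true -> registered
             catch
                 error:badarg -> name_taken
             end,
    Mine ! stop,
    Yours ! stop,
    Result.

wait() ->
    receive
        stop -> ok
    end.
```

Whichever module starts second is the one that breaks, and neither author
could have known about the other.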

It would not surprise me if someone could come up with a better way to
protect names for processes that should not be subject to tampering.
Arguably Lawrie Brown's Safe Erlang "sandboxes" could do this by having
per-sandbox registries.

I spent months trying to come up with something that could be fast and
safe, and then put it aside hoping someone else would do better.