[erlang-questions] On Per-module process registration

Thu Feb 11 02:47:51 CET 2010

I've been reading the EEP and trying to grasp it, but it was this post that
really made clear what you wanted. I must say the supervisor does sound like
a good place to put the registry (I've found myself wanting this very design
quite a few times already).

This works when you want to address one member of a group of processes from
the outside. My concern is that children which need the registry to talk to
each other must then know about their supervisor and the names it assigns.
That sounds like an indirect circular dependency Erlang programs could do
without.
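
To make that coupling concrete, here is a minimal sketch of a sibling lookup
built only on the existing supervisor:which_children/1 call. The module name
sup_send and the function send_child/3 are hypothetical, not part of OTP; the
point is that the caller must be handed both the supervisor reference and the
sibling's child Id. Note also that a child cannot safely call this from its
own init/1, since the supervisor is still busy starting it.

```erlang
-module(sup_send).
-export([send_child/3]).

%% Hypothetical helper, not part of OTP: find the child with the given Id
%% via the public supervisor:which_children/1 API and send it Message.
send_child(SupRef, Id, Message) ->
    case lists:keyfind(Id, 1, supervisor:which_children(SupRef)) of
        {_, Pid, _Type, _Modules} when is_pid(Pid) ->
            Pid ! Message,
            {ok, Message};
        {_, _NotAPid, _Type, _Modules} ->
            %% Known Id, but the child is stopped or being restarted.
            {error, not_running};
        false ->
            {error, not_found}
    end.
```

Unlike a version living inside the supervisor, the lookup and the send here
are two separate steps, so the child may die in between.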

On Wed, Feb 10, 2010 at 8:29 PM, Richard O'Keefe <ok@REDACTED> wrote:

>
> On Feb 10, 2010, at 10:46 PM, Anthony Shipman wrote:
>
>> I found the pid_name EEP a bit confusing and I'm not sure that it solves
>> the problems that I have.
>>
>
> I will be happy to revise it once I know what is confusing about it.
>
>
>> Typically I will have a master gen_server
>> that works with a variable number of slave gen_servers each running
>> the code of the same module.
>>
>
> I don't quite understand what you mean here.
>
> Since the -pid_name proposal deals with a *static* set of named
> "encapsulated within modules but shared between processes and
> automatically appropriately locked pid-valued mutable variables",
> it _doesn't_ handle a variable number of anythings.
>
> The problem it was intended to solve was the problem of code
> written according to the examples in Erlang books where a module
> controls one process and registers a name for it so that the
> module can communicate with that process, but far from any
> desire for other processes or modules to find it, the possibility
> of other processes or modules sending messages to that process
> creates a vulnerability.
>
>
>> The master wants to send a message to a
>> slave. It can't use a pid since the slave might die and be restarted
>> by its supervisor at any time.
>>
>
> Why can't the master keep a dictionary of slaves, keyed by whatever
> it wants, as part of its state?  Since the master has to be informed
> of the slaves' deaths anyway, it can update the dictionary at the
> same time, no?
>
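
A minimal sketch of that dictionary-in-the-master idea follows, under some
loudly stated assumptions: the module name slave_directory and its API are
hypothetical, each slave is assumed to announce itself with register_slave/3
every time it (re)starts, and the master learns of deaths through monitors.
A map stands in for the dictionary.

```erlang
-module(slave_directory).
-behaviour(gen_server).

-export([start_link/0, register_slave/3, send_slave/3]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

%% State is a map: #{Id => {Pid, MonitorRef}}.

start_link() ->
    gen_server:start_link(?MODULE, [], []).

%% Assumption: a slave calls this each time it (re)starts.
register_slave(Master, Id, Pid) ->
    gen_server:call(Master, {register, Id, Pid}).

send_slave(Master, Id, Message) ->
    gen_server:call(Master, {send, Id, Message}).

init([]) ->
    {ok, #{}}.

handle_call({register, Id, Pid}, _From, Slaves) ->
    %% Drop the monitor on any previous incarnation of this Id.
    case Slaves of
        #{Id := {_OldPid, OldRef}} -> erlang:demonitor(OldRef, [flush]);
        _ -> ok
    end,
    Ref = erlang:monitor(process, Pid),
    {reply, ok, Slaves#{Id => {Pid, Ref}}};
handle_call({send, Id, Message}, _From, Slaves) ->
    case Slaves of
        #{Id := {Pid, _Ref}} ->
            Pid ! Message,
            {reply, ok, Slaves};
        _ ->
            {reply, {error, not_running}, Slaves}
    end.

handle_cast(_Msg, Slaves) ->
    {noreply, Slaves}.

handle_info({'DOWN', Ref, process, _Pid, _Reason}, Slaves) ->
    %% Forget whichever entry owned this monitor.
    {noreply, maps:filter(fun(_Id, {_P, R}) -> R =/= Ref end, Slaves)};
handle_info(_Other, Slaves) ->
    {noreply, Slaves}.
```

No atoms are created and nothing is visible in the node-wide registry; only
processes that hold the master's pid can reach a slave by Id.
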
>
>> So I must synthesise an atom for a name
>> that looks something like '<scope>/<slave_id>' and put it into the node's
>> registry. I worry about the overhead of looking up such a name when the
>> registry has 100,000 processes in it.
>>
>
> The process registry is a hash table.
> Accesses to it need to be locked.
> I tried a wee experiment:
>   - create N processes
>   - send K atomic messages to each
> two ways: not registered, Pid!..., and registered, Name!....
> On an Intel Core 2 Duo Macintosh laptop, +S2:2,
>
>   * send via Pid  4.97 microseconds (N=20k) 4.62 (N=100k)
>   * send via Name 7.06 microseconds (N=20k) 6.92 (N=100k)
>
> It looks as though it scales fairly well.
>
> The question is what merit you see in unrelated
> processes being able to easily do
>
>        exit(whereis('<scope>/<slave_id>'), kill).
>
> That's what my EEP is about.
>
>
>>
>> I've thought that it would be nice if the supervisor could take care
>> of sending messages e.g. have supervisor:call_to(ChildID, Msg).
>>
>
> That seems the obvious way to do it.  After all, if you're using
> OTP supervision, each child _has_ an Id in its child_spec().
> We have
>        supervisor:terminate_child(SupRef, Id)
>        supervisor:delete_child(SupRef, Id)
>        supervisor:restart_child(SupRef, Id)
> So how hard could
>        supervisor:send_child(SupRef, Id, Message)
> be?
>
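> %% Both functions are meant to be added to supervisor.erl itself, where
> %% call/2, get_child/2 and the #child{} record are already defined.
> send_child(Supervisor, Name, Message) ->
>   call(Supervisor, {send_child, Name, Message}).
>
> handle_call({send_child, Name, Message}, _From, State) ->
>    case get_child(Name, State)
>      of {value, Child} when is_pid(Child#child.pid) ->
>             Child#child.pid ! Message,
>             {reply, {ok, Message}, State}
>       ; {value, _} ->          % known Id, but no live pid right now
>             {reply, {error, not_running}, State}
>       ; _ ->                   % no child with this Id
>             {reply, {error, not_found}, State}
>    end;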
>
> WARNING: this code has not been tested.
> Compiled, yes.  Tested, no.
>
> This seems like such an obvious thing to do that I must be
> missing something about how supervision is _supposed_ to be
> used.
>
>
>> The supervisor then becomes the scope for the naming system. But that could
>> be awkward in a non-trivial supervisor tree.
>>
>
> I think the main question about that would be whether it is
> in general advisable, and I'd like to stay out of that
> argument.  If I don't understand why send_child/3 isn't
> there already, I _certainly_ am not competent to argue
> why send_descendant(Supervisor, [Id1,...,Idn], Message) isn't.
>
>
>>
>> Here's an in-between idea. Let's make a registry be a first class
>> object. It will function as a scope for a set of process names. The API
>> would look something like:
>>
>>   registry:new()
>>   registry:spawn_link(Registry, Name, MFA)
>>   registry:send_to(Registry, Name, MFA)
>>   registry:call_to(Registry, Name, MFA)
>>   registry:cast_to(Registry, Name, MFA)
>>
>> The spawn function would be atomic wrt spawning and registration. The
>> master would create a registry for itself and spawn each slave within
>> its scope.
>>
>
> A supervisor already has a data structure doing this job.
> It happens to be implemented as a list of #child records,
> but that's the job it's doing.
>
> You can't _find_ the master's registry without asking it,
> and on encapsulation principles, I can't see the master
> handing it out to strangers.  (I know "Joe Hates OO", but
> the Law of Demeter seems to apply here.)  You might as
> well ask the master to do the job for you.  Just replace
> "registry" and "Registry" with "supervisor" and "Supervisor",
> add a few support functions, and you're done.
>
> The principal difference between a dynamically created
> registry like this and a supervisor would seem to be
> that a supervisor will try to restart dead children,
> while a registry will just scratch them off its list.
> It's already the case that a supervisor can be told not
> to resurrect specific children.
>
>
>>
>> The distinction between a registry and a supervisor could be blurred by
>> giving the registry the ability to restart spawned processes that crash.
>>
>
> One might as well go all the way and unify dynamic registries and
> supervisors completely, and call them supervisors.
>
> There is one thing that a dynamic registry could do that a
> supervisor cannot do, and that is to hold unrelated/uncontrolled
> processes.
>
> It looks to me as though writing a dynamic registry module should
> not be incredibly difficult, given a design; it's just another
> plain old Erlang module.
>
>
>>
>> Another issue is whether a registry could be accessed from more than
>> one node.
>>
>
> This of course is heading in completely the opposite direction from
> my proposal, which is all about _hiding_ processes, not revealing
> them in interesting and useful ways.
>
> Can gproc do any of the things you need?
>