[erlang-questions] Distributed Process Registry

Sun Feb 8 21:59:48 CET 2015

Hello Michael,
I know of those but I've left them out as I do not need group mechanisms: I'm not interested in broadcasting messages to multiple devices, just 1-to-1 messaging.

Is there any reason why using these process groups would be beneficial in my use case?

Thank you for your input,
r.

> On 08/feb/2015, at 21:51, Michael Truog <mjtruog@REDACTED> wrote:
> 
>> On 02/08/2015 11:56 AM, Roberto Ostinelli wrote:
>> Dear list,
>> I have 3 interconnected nodes to which various devices connect.
>> Once a device connects to one of those nodes, the related TCP socket events are handled by a device_loop process on the node that it originally connected to.
>> 
>> Every device is identified via its id (a binary). I need to enable communication from one device to another based on these ids, even across different nodes. I have around 150k device processes per node (so up to 500k in total).
>> 
>> So, I basically need a global process registry. Not new, but haven't used one in a while now.
>> 
>> As far as I can tell, my main options to send messages from one device process to another based on their id are the Erlang global module, Ulf's gproc, or a custom solution based on, for instance, mnesia in RAM only.
>> 
>> 
>> I was first thinking of leaning towards using the erlang global module, since register_name/2,3 now also allows general terms to be used as Name. The advantages I see:
>> * It is a simple built-in mechanism.
>> * If a node goes down, the global names registered on that node are unregistered automatically.
>> * If a new node is added, the global names registered are propagated automatically.
>> The cons:
>> * I always feel that process registration should be used to identify long-running services.
>> * I don't know if 500k is an acceptable number (i.e. if the global module is made to support my use case).
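
A minimal sketch of the global-module option weighed above, assuming each device_loop process registers itself under a {device, Id} tuple. The module name device_registry and both function names are hypothetical; only global:register_name/2 and global:whereis_name/1 are actual OTP calls. It says nothing about whether global copes with 500k names.

```erlang
-module(device_registry).
-export([register_device/1, send_to_device/2]).

%% Register the calling process (e.g. a device_loop) under its binary id.
%% global:register_name/2 returns yes on success and no otherwise.
register_device(DeviceId) when is_binary(DeviceId) ->
    case global:register_name({device, DeviceId}, self()) of
        yes -> ok;
        no  -> {error, already_registered}
    end.

%% Look up the device by id on any connected node and send it Msg.
send_to_device(DeviceId, Msg) when is_binary(DeviceId) ->
    case global:whereis_name({device, DeviceId}) of
        undefined ->
            {error, not_found};
        Pid when is_pid(Pid) ->
            Pid ! Msg,
            ok
    end.
```

A device_loop would call device_registry:register_device(Id) once after the device has identified itself; any process on any connected node could then call device_registry:send_to_device(Id, Msg).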
>> 
>> I also looked into gproc. The advantages I see:
>> * Actively maintained, it seems to have been built for my use case.
>> The cons:
>> * For the distributed part it relies on gen_leader. I've heard too many horror stories on gen_leader. Maybe that's not a thing anymore.
>> * Not sure what happens if a node goes down / a new node is added.
>> 
>> I've considered a custom solution based on mnesia distributed ram-only tables that would store the pids of the device loops based on their binary id. The advantages I see:
>> * Mnesia will take care of distributing, handling down events, etc.
>> The cons:
>> * I need to reinvent the wheel and ensure that when a node goes down, all the device entries in the distributed mnesia tables related to that node are removed.
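
A rough sketch of what the custom mnesia option could look like, including the per-node cleanup mentioned in the cons. Everything named here is an assumption for illustration: the module device_table, the device record and its fields, and the watcher loop. The table is created as a ram copy on the local node only; a real 3-node setup would also need the table replicated to the other nodes, which is not shown.

```erlang
-module(device_table).
-export([init/0, add/2, lookup/1, purge_node/1, watch_nodes/0]).

%% Hypothetical record: binary device id, pid of its device_loop, owning node.
-record(device, {id, pid, node}).

%% Start mnesia and create a ram-only table indexed on the node field.
init() ->
    ok = mnesia:start(),
    case mnesia:create_table(device,
                             [{ram_copies, [node()]},
                              {attributes, record_info(fields, device)},
                              {index, [node]}]) of
        {atomic, ok} -> ok;
        {aborted, {already_exists, device}} -> ok
    end,
    mnesia:wait_for_tables([device], 5000).

%% Store the pid of a device_loop under its binary id.
add(Id, Pid) when is_binary(Id), is_pid(Pid) ->
    mnesia:dirty_write(#device{id = Id, pid = Pid, node = node(Pid)}).

lookup(Id) ->
    case mnesia:dirty_read(device, Id) of
        [#device{pid = Pid}] -> {ok, Pid};
        [] -> not_found
    end.

%% Remove every entry whose process lived on Node.
purge_node(Node) ->
    Stale = mnesia:dirty_index_read(device, Node, #device.node),
    lists:foreach(fun(#device{id = Id}) ->
                          mnesia:dirty_delete(device, Id)
                  end, Stale).

%% Subscribe to node status messages and purge entries on nodedown.
watch_nodes() ->
    net_kernel:monitor_nodes(true),
    watch_loop().

watch_loop() ->
    receive
        {nodedown, Node} -> purge_node(Node), watch_loop();
        {nodeup, _Node}  -> watch_loop()
    end.
```

The sketch uses dirty operations for brevity, so it gives no transactional guarantees, and it does not remove the entry when a single device process exits while its node stays up; that would need a monitor per device process.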
>> 
>> 
>> Has someone recently implemented a distributed process registry and can shed some light for me?
>> 
>> Thank you in advance for your advice ^^_
>> r.
>> 
> You are missing a few options:
> 
> http://www.erlang.org/doc/man/pg2.html
> * Any term can be used for a name
> 
> https://github.com/okeuday/cpg/
> * By default uses string (list of integers) names, but this can be changed with the group_storage application env setting (e.g., to dict)
> * Supports any number of scopes, which are atoms that are used as locally registered cpg process identifiers (pg2 only supports the single global scope stored in ETS)
> * Supports the via syntax, like gproc does, with variations that allow pools to be created (https://github.com/okeuday/cpg/blob/master/test/cpg_test.erl#L83-L104)
> 
> Both pg2 and cpg allow you to avoid centralized global state (the state used in gproc, locks_leader, mnesia, global) so that netsplits do not require an arbitrary process to resolve state conflicts.  That is very important for reliability.
> 
> 