[erlang-questions] Distributed Process Registry

Sun Feb 8 23:14:06 CET 2015

On 02/08/2015 01:41 PM, Loïc Hoguin wrote:
> On 02/08/2015 10:10 PM, Michael Truog wrote:
>>> Is there any reason why using these process groups would be beneficial
>>> in my use case?
>> The main reason is that you avoid the need to resolve state conflicts
>> when global state gets merged after a netsplit.  With pg2 and cpg, all
>> the state relevant to the local node is stored locally and remote state
>> gets merged as nodes are added.  When a node dies, its pids are removed,
>> as expected, but there is no need for centralized global state.
>
> I'm curious.
>
> When a node connects it sends its state to all the nodes it connects to? And when a process group gets registered it sends this info to all the nodes?
>
Yes, basically.  Node connections are monitored, and when a new node appears (after a netsplit or a new node connection) the local cpg scope process sends its state to the new node's cpg scope process.  When cpg is first started, it also announces itself to the other nodes' cpg processes, so they will send their state to it.  The received state is merged into the local state, so that remote pids are stored and monitored.
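To make the merge step concrete, here is a minimal sketch of how a scope process might fold a remote node's state into its own.  It is not cpg's actual code: the module name, the function names and the state shape (a map from group name to a list of pids) are all assumptions made for illustration.

```erlang
-module(scope_merge_sketch).
-export([merge/2]).

%% Hypothetical state shape: #{GroupName => [pid()]}.
%% Returns {MergedState, NewPids}, where NewPids are the pids that were
%% not known locally before the merge (a pid may appear once per group).
merge(Local, Remote) ->
    maps:fold(fun merge_group/3, {Local, []}, Remote).

%% Add the remote members of one group that are not already known.
merge_group(Group, RemotePids, {State, NewAcc}) ->
    Known = maps:get(Group, State, []),
    New = [P || P <- RemotePids, not lists:member(P, Known)],
    {maps:put(Group, Known ++ New, State), New ++ NewAcc}.
```

In a real scope process the caller would presumably run this when a {nodeup, Node} message arrives (after net_kernel:monitor_nodes(true)) and the peer's state has been received, and would then call erlang:monitor(process, Pid) on each new pid so that its death can be noticed later.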

When a process is added to a cpg group, it is added locally, and the addition is sent asynchronously to remote nodes (if they don't receive it, a netsplit or reliability problem is happening anyway, and that gets resolved by the state exchange described above).  This works like pg2, except that pg2 uses only a single scope and makes the addition synchronous, relying on the global module for a global transaction lock.  The cpg approach avoids a global lock by requiring that the cpg scope process be locally consistent (the single Erlang process effectively acts as a mutex), which means that the cpg scope process only deals with local node pids (you cannot add remote pids to the local cpg process).  There is a macro to get the pg2 approach in cpg (undefine GROUP_NAME_WITH_LOCAL_PIDS_ONLY), but it is better to keep that restriction to avoid a dependency on the global module.  The cpg return values are the same as pg2's, so you can switch between them if you aren't using cpg-specific features, like scopes.
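The local-only join with an asynchronous broadcast can be sketched roughly as follows.  This is an illustration under assumptions, not cpg's implementation: the module, the {join, Group, Pid} message format, the {error, pid_not_local} return value and the idea that every node registers its scope process under the same ScopeName are all hypothetical.

```erlang
-module(local_join_sketch).
-export([join/4]).

%% State is a hypothetical #{GroupName => [pid()]} map owned by the
%% local scope process.  Only pids living on this node are accepted.
join(Group, Pid, State, {ScopeName, RemoteNodes}) when node(Pid) =:= node() ->
    Members = maps:get(Group, State, []),
    NewState = maps:put(Group, [Pid | Members], State),
    %% Fire-and-forget notification: no reply is awaited and no global
    %% lock is taken.  A lost message is repaired by the state exchange
    %% that happens when the nodes reconnect.
    lists:foreach(fun(Node) ->
                      {ScopeName, Node} ! {join, Group, Pid}
                  end, RemoteNodes),
    {ok, NewState};
join(_Group, _Pid, State, _Remote) ->
    %% Remote pids (or non-pids) are rejected; the error term is made up.
    {{error, pid_not_local}, State}.
```

Because all updates for a scope pass through one process, the join needs no lock: the process mailbox serializes them.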

cpg doesn't require that a group be created before the join, but pg2 does.  So cpg usage can rely solely on join/leave for group membership.
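The difference in join semantics can be modelled with a small toy, again over a hypothetical #{GroupName => [pid()]} map.  Neither function is the real library code; the only detail taken from pg2 itself is the {error, {no_such_group, Name}} term that pg2:join/2 returns for a group that was never created.

```erlang
-module(join_semantics_sketch).
-export([pg2_style_join/3, cpg_style_join/3]).

%% pg2-like behaviour: the group must already exist (pg2:create/1).
pg2_style_join(Group, Pid, State) ->
    case maps:find(Group, State) of
        {ok, Members} ->
            {ok, maps:put(Group, [Pid | Members], State)};
        error ->
            {{error, {no_such_group, Group}}, State}
    end.

%% cpg-like behaviour: the first join implicitly creates the group.
cpg_style_join(Group, Pid, State) ->
    Members = maps:get(Group, State, []),
    {ok, maps:put(Group, [Pid | Members], State)}.
```

With the second style, code that only ever calls join and leave never has to worry about group creation order.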