[erlang-questions] Distributed Process Registry
Mon Feb 9 00:51:10 CET 2015
On 02/08/2015 03:27 PM, Ulf Wiger wrote:
>> On 08 Feb 2015, at 23:57, Michael Truog <mjtruog@REDACTED> wrote:
>> Due to this lack of consistency it is incorrect to say global or gproc are partition tolerant. Yes, they can handle a netsplit occurring, but they do not tolerate the failure in a consistent way.
> This is an interesting definition of ‘tolerant’. So you’re saying that because global and gproc offer a choice of methods to handle a netsplit, they are _less_ tolerant than if they had only one method? Perhaps I misunderstood.
If resolving the separate chunks of data that exist after a netsplit requires user source code to pick which data is "correct", the result cannot be consistent: it is an arbitrary process (ad hoc, based on your use case). I don't believe that is being partition tolerant. It is instead ignoring the problem of partition tolerance and telling the user: "you should really figure this out".
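To make the point concrete, here is a minimal Python sketch of merging unique-name registrations from two sides of a healed netsplit. It is not code from global or gproc; `merge_unique_names`, `keep_lowest`, and `keep_highest` are hypothetical names, and pids are modelled as plain integers purely for illustration. The only thing it is meant to show is that the outcome depends on whichever resolver the user supplies.

```python
def merge_unique_names(side_a, side_b, resolve):
    """Merge two {name: pid} registries built on opposite sides of a netsplit.

    `resolve(name, pid_a, pid_b)` is a user-supplied callback (hypothetical)
    that picks the surviving pid when both sides registered the same name.
    """
    merged = dict(side_a)
    for name, pid in side_b.items():
        if name in merged and merged[name] != pid:
            # Both partitions claimed the name: the data alone cannot say
            # which registration is "correct", so the caller has to decide.
            merged[name] = resolve(name, merged[name], pid)
        else:
            merged[name] = pid
    return merged


def keep_lowest(name, pid_a, pid_b):
    # One arbitrary policy: the lower pid wins.
    return min(pid_a, pid_b)


def keep_highest(name, pid_a, pid_b):
    # An equally arbitrary policy: the higher pid wins.
    return max(pid_a, pid_b)


side_a = {"db": 10, "log": 3}
side_b = {"db": 42, "cache": 5}

merged_low = merge_unique_names(side_a, side_b, keep_lowest)
merged_high = merge_unique_names(side_a, side_b, keep_highest)
```

With the same two inputs, `merged_low` and `merged_high` disagree about who owns "db". Names registered on only one side merge without any help; only the conflicting name needs the callback.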
> I agree that it can be a problem in a given system that different components automatically try to resolve an inconsistency using potentially different strategies. For this reason, I’ve long argued that one should have one master arbiter; the other systems need to be able to adapt. Otherwise, the different conflict resolution decisions can actually _cause_ inconsistencies from a system perspective.
This stance appears to be contradicted by usage of gproc properties. You can have automatic conflict resolution that does not cause inconsistencies, i.e., it does not need to be a manual process that requires a master arbiter.
> I should note that gproc really only does ordered conflict resolution in the uw-locks_leader branch (and I just spotted a bug in it).
> But again, I don’t think one should compare distributed group membership solutions with gproc’s unique names. They are not the same, just as comparing a transaction-based DBMS with a Dynamo-style KV-store is an apples-to-oranges comparison.
> If you really need to keep one unique instance of something in a distributed system, you will have a consistency challenge whether you want it or not. There will be tradeoffs, and as far as I know, there is no single tradeoff that is the best option for every situation. Thus, different options for different use cases.
> But properties in gproc are not unique. In the local case, you don’t even need to involve the central gproc server when registering a property (except to ensure monitoring). When merging data after a netsplit, the node where a process resides is considered the authority on what properties exist for the process. This is safe, since only the process itself can register properties.
> Thus, a process group will be all processes that share a given property. There is no tricky conflict resolution after netsplit in that case.
Then that appears to be similar to cpg usage, since cpg is using the node where the process resides as the authority on which groups it is a member of. The gproc unique names are what require manual intervention for obvious reasons, since they are unable to tolerate a partition in a consistent way. I understand this difference is based on use-cases, but I think it is an important distinction.
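As a rough illustration of why the group/property case merges without conflicts, here is a small Python sketch of the "node where the process resides is the authority" rule. It is not taken from cpg or gproc; `merge_views` and the data layout (each node's view as `{group: set of (owner_node, pid)}`) are assumptions made only for this example.

```python
from collections import defaultdict


def merge_views(views):
    """Merge per-node membership views after a netsplit heals.

    `views` maps a node name to that node's view: {group: {(owner_node, pid)}}.
    Each node is trusted only for the processes that reside on it, so any
    (possibly stale) entries it holds about other nodes' processes are ignored.
    """
    merged = defaultdict(set)
    for node, view in views.items():
        for group, members in view.items():
            for owner, pid in members:
                if owner == node:
                    merged[group].add((owner, pid))
    return dict(merged)


views = {
    # Node "a" still holds a stale entry for pid 7 on node "b".
    "a": {"workers": {("a", 1), ("b", 7)}},
    # Node "b" knows pid 7 is gone and pid 8 joined while partitioned.
    "b": {"workers": {("b", 8)}, "cache": {("b", 9), ("a", 1)}},
}

merged = merge_views(views)
```

No resolver callback is needed here: every entry has exactly one authoritative source, so the merge gives the same result regardless of the order in which the views are processed.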
> Ulf W
> Ulf Wiger, Co-founder & Developer Advocate, Feuerlabs Inc.