[erlang-questions] ETS tables and pubsub

Christopher Meiklejohn <>
Wed Nov 13 20:27:28 CET 2013


On Wednesday, November 13, 2013 at 2:13 PM, Garret Smith wrote:
> I know Ulf has been doing some work on gproc to make it handle netsplits better. I keep meaning to try it out...
>  
> http://erlang.org/pipermail/erlang-questions/2013-June/074345.html
> For me, I'm using a combination of gen_leader and local-only gproc. My application has partitioned graphs of data flow & process interaction, so I can use gen_leader to manage moving entire graphs between nodes and local gproc for processes within a graph to find each other.

Right, there are two fundamental problems here with the existing gproc:

1. Its reliance on gen_leader, and its problems with deadlocks, dynamic membership, and network partitions.
2. gproc’s resolution strategies for conflicting values once a network partition heals.

Ulf has done a ton of work on the second, but I haven’t had a chance to look at it myself.  Ulf’s also left some comments [3] on my blog regarding gen_leader and the conflict resolution strategies.
> A generic process registry that handles netsplit and nodes entering and leaving the cluster is a Very Hard Problem(TM). You'd still have to write some bits yourself, like how to merge registries after a netsplit. That's why a lot of application-specific solutions (like mine) exist to exploit the inherent properties of the problem.

Agreed; however, if you’re willing to relax your consistency requirements, there are alternative approaches. That being said, I recently published a paper on building a more robust, fault-tolerant process registry [1] [2], in which I was able to handle the merge operations through the use of CRDTs, ensuring that the registry converges to the correct value.
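
To make the CRDT merge idea concrete, here is a minimal sketch of an observed-remove set used for group membership. It is illustrative only and is not the Riak PG code (which is Erlang); the class name `ORSet`, its methods, and the use of random UUID tags are all assumptions made for this example. Each join records a unique tag, a leave tombstones only the tags that replica has seen, and merge is a union of both sides, so replicas merged in either order end up with the same membership.

```python
import uuid


class ORSet:
    """Observed-remove set: a state-based CRDT sketch (hypothetical names)."""

    def __init__(self):
        self.adds = {}     # member -> set of unique tags created by join()
        self.removes = {}  # member -> set of tags that were observed and removed

    def join(self, member):
        # Every join gets a fresh tag so a later re-join survives an older leave.
        self.adds.setdefault(member, set()).add(uuid.uuid4().hex)

    def leave(self, member):
        # Tombstone only the tags this replica has actually observed.
        observed = self.adds.get(member, set())
        self.removes.setdefault(member, set()).update(observed)

    def merge(self, other):
        # Union of adds and union of removes: commutative, associative, idempotent.
        merged = ORSet()
        for src in (self, other):
            for member, tags in src.adds.items():
                merged.adds.setdefault(member, set()).update(tags)
            for member, tags in src.removes.items():
                merged.removes.setdefault(member, set()).update(tags)
        return merged

    def members(self):
        # A member is present if at least one of its tags has not been removed.
        return {m for m, tags in self.adds.items()
                if tags - self.removes.get(m, set())}


# Two replicas diverge during a partition, then merge in either order.
a, b = ORSet(), ORSet()
a.join("pid_1")
b = b.merge(a)     # replicate before the partition
a.leave("pid_1")   # side A removes pid_1
b.join("pid_1")    # side B concurrently re-adds pid_1
b.join("pid_2")
assert a.merge(b).members() == b.merge(a).members()
```

The trade-off in this sketch is that a concurrent join wins over a leave, and tombstones are never garbage collected; a real registry would have to decide whether those semantics are acceptable.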

I have another prototype somewhere that used Riak Core to manage replicated gproc instances, which was the original starting point for Riak PG. However, I abandoned it when I realized that portions of the API would be hard to either implement or reconcile, which led to my current approach.

[1] http://dl.acm.org/citation.cfm?id=2505309
[2] http://christophermeiklejohn.com/erlang/riak/crdt/2013/06/24/introducing-riak-pg-distributed-process-groups-for-erlang.html
[3] http://christophermeiklejohn.com/erlang/2013/06/05/erlang-gproc-failure-semantics.html#comment-922065906

--  
Christopher Meiklejohn
Software Engineer

Basho Technologies, Inc.  



