[erlang-questions] conditional pub-sub in gproc

Ulf Wiger <>
Sun Jun 17 03:58:06 CEST 2012


The main problem is that, while gen_leader is a very well verified implementation of Stoller's leader election algorithm (Hans Svensson got his PhD on the subject), the algorithm doesn't handle netsplits. One can of course "hack" the algorithm, but leader election algos all have a specific band of circumstances within which they are valid. The Stoller algo fits Erlang well, save for the netsplit problem.

(gen_leader was originally based on a different algo, but Hans proved it unsound given the failure modes that exist in Erlang.)

I think the latest mods to gen_leader seem sound, but there need to be hooks to userland, allowing the leader to merge data and establish a new baseline.

My own secret plan is to finish my locking framework that does distributed deadlock detection in a minimal way without a central dependency graph. I've been sitting on that algo since 1993, and Thomas Arts helped me verify it some years ago. Only recently did it occur to me that if two leaders grabbed for an election lock after a netsplit, they would deadlock, and my algo would automatically resolve the deadlock and grant the lock to one of them.

Lots to do right now, but one of these days I may release it. :)

BR,
Ulf W

Ulf Wiger, Feuerlabs, Inc.
http://www.feuerlabs.com

On 16 Jun 2012, at 13:43, Tim Watson <> wrote:

> Guys - is there a good discussion/description of the outstanding issues with gen_leader? I'd like to understand its limitations a bit better.
> 
> Cheers,
> Tim 
> 
> On 16 Jun 2012, at 01:58, Ulf Wiger wrote:
> 
>> 
>> Good points.
>> 
>> Please try out the latest version, which includes a few new functions:
>> 
>> gproc:await(Node, Key, Timeout)
>> https://github.com/esl/gproc/blob/master/doc/gproc.md#await-3
>> 
>> gproc:wide_await(Nodes, Key, Timeout)
>> https://github.com/esl/gproc/blob/master/doc/gproc.md#wide_await-3
>> 
>> gproc:nb_wait(Node, Key)
>> https://github.com/esl/gproc/blob/master/doc/gproc.md#nb_wait-2
>> 
>> gproc:cancel_wait(Node, Key, Ref)
>> https://github.com/esl/gproc/blob/master/doc/gproc.md#cancel_wait-3
>> 
>> 
>> Note that await/3 also returns the Value, so it can be thought of as a
>> distributed get_value() function.
>> 
>> BR,
>> Ulf W
>> 
>> 
>> On 15 Jun 2012, at 16:25, Loïc Hoguin wrote:
>> 
>>> Sorry I should have been clearer.
>>> 
>>> I'm using global gproc to locate processes that may be on any node of the cluster. But sometimes I run into gen_leader's issues (for example, if a node crashes), so I was wondering whether I could still use gproc for my purposes without gen_leader.
>>> 
>>> Currently it's easy to broadcast information to other local gprocs but not so much to retrieve remote data, similar to what get_value/2 or where/1 would do in a distributed context.
>>> 
>>> It would be nice to be able to have a cluster of local gprocs and easily access them from any remote node, similar to how bcast works for broadcasting.
>>> 
>>> On 06/16/2012 01:18 AM, Ulf Wiger wrote:
>>>> 
>>>> On 15 Jun 2012, at 16:10, Loïc Hoguin wrote:
>>>> 
>>>>> That removes some functionality though, doesn't it? Like get_value/1 and where/1. How would you suggest using these without global gproc?
>>>> 
>>>> Uhm, those functions work perfectly fine in a local context…
>>>> 
>>>> Can you expand on that?
>>>> 
>>>> Global gproc (gproc_dist) is disabled by default.
>>>> My assumption is that most people leave it that way.
>>>> 
>>>> BR,
>>>> Ulf W
>>>> 
>>>> Ulf Wiger, Co-founder & Developer Advocate, Feuerlabs Inc.
>>>> http://feuerlabs.com
>>>> 
>>>> 
>>>> 
>>> 
>>> 
>>> -- 
>>> Loïc Hoguin
>>> Erlang Cowboy
>>> Nine Nines
>>> 
>>> 
>> 
>> Ulf Wiger, Co-founder & Developer Advocate, Feuerlabs Inc.
>> http://feuerlabs.com
>> 
>> 
>> 
> 


