[erlang-questions] ANN: locks (beta) - distributed lock manager and leader election

Ulf Wiger ulf@REDACTED
Thu Oct 10 21:57:14 CEST 2013


Hi Geoff,

For the time being, I included Thomas Arts's old description of the algorithm in the README.md (the "Verification" chapter).

That description deals with the locking algorith. The locks_leader behavior simply has all candidates trying to claim a specific lock on every visible node (it would be possible to add an option to restrict the list of candidate nodes). Since the locks_agent will tell its client (a candidate) which other agents & clients are vying for the lock, all candidates can be derived from the lock queue. And since the algorithm will resolve deadlocks through 'lock surrender', one agent will always end up holding all locks*, and thus becomes the leader.

Workers simply monitor the lock, which means they also get the lock updates. Once they get an update, they find any new candidate and send it a message, introducing themselves as workers.

BR,
Ulf

* It's essentially the same lock on all nodes, but each lock+node instance counts as one lock. This is also why a deadlock situation can occur.

On 10 Oct 2013, at 21:48, Geoff Cant <geoff@REDACTED> wrote:

> Hi Ulf, do you have a synopsis of the protocol/distributed algorithm the locks_agent run?
> 
> I had a lot of trouble trying to figure out which protocol/algorithm the gen_leader variants were using the last time I looked at them. Having a description of which algorithm they were implementing (say 'asynchronous bully leader election') and the modifications they made (protocol messages to change the candidate set) would be a significant help to understanding the codebase.
> 
> -Geoff
> 
> On 2013-10-09, at 12:41 , Ulf Wiger <ulf@REDACTED> wrote:
> 
>> 
>> Right after I sent this, of course I found some issues with the locks_leader and recovery from netsplits - It did it, but it wasn't exactly elegant. Did I mention it was a Beta release…?
>> 
>> Now it's a lot better, at least in my tests.
>> 
>> There is now also some documentation of the locks_leader callback under
>> https://github.com/uwiger/locks/blob/master/examples/doc/test_cb.md
>> 
>> and the gdict example actually merges the dictionaries after netsplit, albeit in a very non-optimal way.
>> 
>> 
>> The real nerds may take a look at the locks_watcher code,
>> 
>> https://github.com/uwiger/locks/blob/master/src/locks_watcher.erl#L12
>> 
>> which expands a pseudo-call in the locks_agent into a remote spawn of an interpreted
>> function, which watches for the locks_server to come up on another node.
>> 
>> What I think is a bit cool is that the interpreted code is actually implemented as regular
>> code in this module (lines 22-65). The parse_transform lifts and abstracts the code into
>> something that becomes interpreted at runtime.
>> 
>> The end result is that locks_agents react immediately to a locks_server starting on
>> another node - no polling required.
>> 
>> BR,
>> Ulf W
> 
> 
> 
> 
> 

Ulf Wiger, Co-founder & Developer Advocate, Feuerlabs Inc.
http://feuerlabs.com






More information about the erlang-questions mailing list