Looking for an algorithm for reliable broadcast

Gleb Peregud gleber.p@REDACTED
Thu Oct 28 16:07:30 CEST 2010


Hello all

I'm looking for algorithm of broadcasting events to a group of
processes in a distributed setting where nodes may fail. Here are
requirements:

1) Speed is the major factor
2) Reliable (no events may be lost, even if it was delivered just to one node)
3) "Weak" consistency:
  a) Events sent by the same actor are delivered to other actors in
the same order as sent
  b) Events sent by different actors may be delivered to an actor in
unspecified order (delivery order may vary at different recipients)
even if sent with considerable time difference

I'd prefer an algorithm that works in optimistic mode (at most
amortized O(n) messages), while falling back to less-efficient mode
(used to fix local view and/or retransmit missing events) if errors
(missing events, bad ordering, etc.) are detected.

Any ideas? I'd be glad for any pointers here. I've found a lot of
related algorithms out there, but it seems to be a very active field
of research that it's hard to find one which suits the problem best.
Have you heard about an efficient "optimistic" algorithm which solves
this kind of problem?

N.B. Additionally each actor has has to maintain a cache of last N
received events, it should be kept consistent (with the same
requirements as message delivery) and when new actor joins a group
it's cache should to be filled. This problem seems to be related but
secondary to the former.

Thanks!
Gleb Peregud


More information about the erlang-questions mailing list