[erlang-questions] State Management Problem

aman mangal mangalaman93@REDACTED
Sat Dec 19 08:29:48 CET 2015

I think that makes sense. We have to make trade-offs to make a system

Though the idea I was pointing to is this: the FLP result
<https://groups.csail.mit.edu/tds/papers/Lynch/jacm85.pdf> rests on the fact
that, in an asynchronous system, there is no way to know whether a process
has really failed or its messages are just delayed, yet Erlang provides a
practical, working way to deal with that. Similarly, can a language or a
standard library like OTP give us practical ways to choose trade-offs among
Consistency, Availability and Partition Tolerance? I feel that the same
problem (the CAP trade-off) is being solved separately in each system.
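
To make the "failed or just delayed" point concrete, here is a minimal
sketch of the usual practical answer: treat silence longer than a timeout as
suspicion of failure. It is written in Python only for illustration; it is
not how the Erlang runtime detects failures, and the names HeartbeatMonitor,
heartbeat and suspected are hypothetical.

```python
import time


class HeartbeatMonitor:
    """Suspects a peer once no heartbeat has arrived within `timeout` seconds."""

    def __init__(self, timeout, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock      # injectable so the sketch can run with a fake clock
        self.last_seen = {}     # peer name -> clock reading at its last heartbeat

    def heartbeat(self, peer):
        """Record that a message from `peer` has just arrived."""
        self.last_seen[peer] = self.clock()

    def suspected(self, peer):
        """True if `peer` was never heard from or has been silent too long.

        This is only a guess: a slow peer and a dead peer look the same here.
        """
        seen = self.last_seen.get(peer)
        return seen is None or self.clock() - seen > self.timeout
```

A peer that is suspected can become trusted again as soon as a late
heartbeat shows up, so the monitor never proves a crash; it only picks a
point at which the rest of the system stops waiting.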

As far as I understand, Joe Armstrong argues in his thesis that a language
can provide constructs to achieve reliability, and that is how Erlang came
into the picture. I wonder whether CAP trade-offs can also be exposed
through a standard set of libraries or language constructs.
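
As a rough illustration of what "exposing the trade-off" could look like,
here is a sketch of quorum replication with tunable N, R and W, the knobs
used by Dynamo-style stores. It is a single-process Python simulation, not
an OTP API; QuorumStore and its fields are hypothetical names, and the `up`
flags stand in for crashes or partitions.

```python
class QuorumStore:
    """N in-memory replicas; a write needs W reachable replicas, a read needs R."""

    def __init__(self, n, r, w):
        self.r, self.w = r, w
        self.replicas = [{} for _ in range(n)]  # each maps key -> (version, value)
        self.up = [True] * n                    # False = unreachable replica

    def _reachable(self):
        return [rep for rep, ok in zip(self.replicas, self.up) if ok]

    def write(self, key, value):
        live = self._reachable()
        if len(live) < self.w:
            raise RuntimeError("unavailable: fewer than W replicas reachable")
        # Next version is one past the newest version any reachable replica holds.
        version = 1 + max(rep.get(key, (0, None))[0] for rep in live)
        for rep in live:
            rep[key] = (version, value)

    def read(self, key):
        live = self._reachable()
        if len(live) < self.r:
            raise RuntimeError("unavailable: fewer than R replicas reachable")
        # Consult only R replicas and return the newest value among them.
        answers = [rep.get(key, (0, None)) for rep in live[:self.r]]
        return max(answers, key=lambda pair: pair[0])[1]
```

With n=3, r=2, w=2 (so R + W > N) every read overlaps the latest write, but
the store refuses requests once two replicas are unreachable. With r=1, w=1
it keeps answering with a single reachable replica, at the price of
possibly returning stale data. The caller picks the trade-off per store
instead of the library picking it for everyone.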

- Aman

On Sat, Dec 19, 2015 at 12:54 AM zxq9 <zxq9@REDACTED> wrote:

> On Saturday, 19 December 2015 00:42:26 aman mangal wrote:
> > Hi everyone,
> >
> > I have been reading a few blogs on Erlang lately, and some of them
> > strongly point out that Erlang solves the reliability problem very nicely
> > for distributed systems. But when I really think about it, Erlang solves
> > only half of the reliability problem. It creates duplicate actors and
> > handles their crashes through linking and supervision, but it does not
> > handle the distributed state management problem at all. If I go back and
> > look at Joe Armstrong's thesis, it also talks about everything in terms of
> > the actor model. I am wondering what assumptions were made about state
> > management when the language was created, and what good ways exist to
> > handle the other half of the reliability problem in Erlang. I understand
> > that this is a hard problem to solve, but at the same time it seems to be
> > a generic problem for distributed systems. Does/can Erlang provide any
> > generic solutions?
>
> Short answer:
> No.
>
> tl;dr:
> Three fundamental problems exist: consistency, availability, partition
> tolerance. Pick two. The Rules forbid solving all three at once.
>
> Discussion:
> The problems of distributed data are threefold, and only two can be solved
> at a time unless you happen to know how to freeze time, open a wormhole or
> beat the speed of light. This is why there are no generic solutions to
> distributed data, only solutions that make tradeoffs of various types, and
> different tradeoffs are best suited to specific situations -- hence the
> impossibility of genericizing any solution.
>
> The basic problem is described in the CAP theorem. It says a system can
> have:
> - Consistency
> - Availability
> - Partition tolerance
> but that you can only have 2 at once.
>
> That doesn't mean that all parts of your system have to make the same
> tradeoff with regard to state management, but again, the fact that a
> tradeoff must be made is an indication that there can never be a truly
> generic solution to this.
>
> What Erlang lets you do is decide *for sure* whether something is running
> or crashing, instead of handling random faults in ad hoc ways. Tolerance
> for distributed failures is *also* something Erlang leaves up to the
> programmer to figure out, because the same CAP problem that exists in
> distributed state management also applies to the system's view of the
> state of its own operational capacity. (Does every node know the state of
> every other node? That's data, too!)
>
> So this is a hard problem. In the real world *most* systems seem to be
> designed to start involving humans once partitions occur (though most have
> the ability to run in a degraded state of service until a sysop fixes
> things). In the imaginary world where there is a software package to cure
> every ill, all our theories are correct, software is bug-free and network
> latency is zero, this is handled automatically by correct implementations
> of logically flawless leader election algorithms that always work, and a
> second partition never occurs in the middle of partition resolution. But
> we don't live in that world.
>
> Partition tolerance is a hard problem, maybe the hardest to code around,
> so most systems seem to make a tradeoff that sacrifices (some level of)
> partition tolerance in exchange for (general, but maybe deferred)
> consistency and (an absolutely insane focus on) availability.
>
> -Craig