[erlang-questions] Processes & Fault Tolerance

Edmond Begumisa <>
Wed Jan 5 16:17:39 CET 2011


On Tue, 04 Jan 2011 20:52:50 +1100, Joe Armstrong <> wrote:

> On Mon, Jan 3, 2011 at 5:30 PM, Edmond Begumisa
> <> wrote:
>>> Let's start with how we do error recovery.
>>>
>>> Imagine two linked processes A and B on separate machines. A is the
>>> master process.
>>> B is a passive process that will take over if A fails.
>>>
>>> A sends a stream of state update messages S1, S2, S3,... to B. The
>>> state messages contain enough information
>>> for B to do whatever A was doing should A fail. If A fails B will
>>> receive an EXIT signal.
>>>
>>> If B does receive an exit signal it knows A has failed, so it carries
>>> on doing whatever A was doing
>>> using the information it last received in a state update message.
>>>
>>> That's it and all about it - nothing to do with supervisors etc,
>>
>> Ooooohh! THANK YOU, THANK YOU, THANK YOU.
>
> Thanks.
>
> Another problem in understanding is that we assume the meaning of words.
>

Yes. There was some confusion about whether we were all talking about
replicating valid state or isolating invalid state. I guess
"fault-tolerance" could be interpreted to mean either. But I'm no expert
on the subject, so I shouldn't comment on terminology -- I just knew the
feature I wanted and was pretty sure it could be done at the process
level; I just had no idea how! From now on, I'll stick to the term
"redundancy" or Ulf's "hot take-over" so it's clearer what I mean.

- Edmond -
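
The scheme Joe describes (A streams state updates to B, B carries on when
it gets A's EXIT) might look roughly like the following. This is only a
sketch under assumptions: the module name takeover, the message shapes
{state, S}, {ready, Pid} and {took_over, ...}, and the simulated_crash
reason are all made up for illustration, and both processes run on one
node to keep it short.

```erlang
-module(takeover).
-export([demo/0]).

%% Standby process B: traps exits, remembers the last state update it
%% received, and reports that state when the master dies.
standby(Parent) ->
    process_flag(trap_exit, true),
    Parent ! {ready, self()},          % only now is it safe to link to us
    standby_loop(undefined, Parent).

standby_loop(LastState, Parent) ->
    receive
        {state, S} ->
            standby_loop(S, Parent);
        {'EXIT', _Master, Reason} ->
            %% A has failed: carry on from the last state we saw.
            %% (Here "carrying on" is just reporting back.)
            Parent ! {took_over, LastState, Reason}
    end.

%% Master process A: links to B, streams state updates 1..N, then
%% crashes on purpose (hypothetical failure for the demo).
master(B, N) ->
    link(B),
    lists:foreach(fun(S) -> B ! {state, S} end, lists:seq(1, N)),
    exit(simulated_crash).

demo() ->
    Self = self(),
    B = spawn(fun() -> standby(Self) end),
    receive {ready, B} -> ok end,      % wait until B is trapping exits
    spawn(fun() -> master(B, 3) end),
    receive
        {took_over, LastState, Reason} -> {LastState, Reason}
    after 1000 -> timeout
    end.
```

Calling takeover:demo() should return {3, simulated_crash}: B saw three
state updates and then the EXIT from A. The ready handshake is there
because B must be trapping exits before A can crash, otherwise the exit
signal would simply kill B. On separate machines the idea is the same,
with A started on another node (for example with spawn(Node, Fun)); what
B actually does after taking over is left out here.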


> Once upon a time I was talking to a guy and used the term "fault
> tolerance". We talked for hours and thought we understood what this
> term meant. Then he said something that implied he was building a
> fault-tolerant system on one computer.
> I said it was impossible. He said, "you catch and handle all the
> exceptions .."
>
> I said, "The entire computer might crash"
>
> He said, "Oh"
>
> There was a long silence.
>
> I told him about Erlang ...
>
>
> /Joe
>
>>
>> *Now* I get it. This is precisely what I was trying to understand. The
>> missing link was sending redundant messages. It all makes sense now --
>> it's so simple. All I have to do to get fault tolerance at the process
>> level is have a group of N redundant processes waiting for exit signals
>> and forward state changes to N-1 of them. I could even send to N/2 and
>> have some decent tolerance at the expense of consistency of the nodes.
>>
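
A rough sketch of the "forward state changes to N-1 of them" idea, with
one master and three standbys. Everything named here (module replicas,
the {state, S}, {ready, Pid} and {last_state, ...} messages, the
simulated_crash reason) is an assumption for illustration. It only shows
that every standby ends up holding the latest state when the master
dies; deciding which standby actually takes over is not addressed.

```erlang
-module(replicas).
-export([demo/0]).

%% Each standby traps exits and keeps the most recent state update.
standby(Parent) ->
    process_flag(trap_exit, true),
    Parent ! {ready, self()},
    standby_loop(undefined, Parent).

standby_loop(Last, Parent) ->
    receive
        {state, S} ->
            standby_loop(S, Parent);
        {'EXIT', _Master, _Reason} ->
            %% Master is gone: report the state we could resume from.
            Parent ! {last_state, self(), Last}
    end.

%% The master links to every standby, forwards each state change to
%% all of them, then crashes on purpose (hypothetical failure).
master(Standbys, Updates) ->
    lists:foreach(fun erlang:link/1, Standbys),
    lists:foreach(
      fun(S) ->
              lists:foreach(fun(P) -> P ! {state, S} end, Standbys)
      end,
      Updates),
    exit(simulated_crash).

demo() ->
    Self = self(),
    Standbys = [spawn(fun() -> standby(Self) end) || _ <- lists:seq(1, 3)],
    %% Wait until every standby is trapping exits before starting the master.
    [receive {ready, P} -> ok end || P <- Standbys],
    spawn(fun() -> master(Standbys, [s1, s2, s3]) end),
    [receive {last_state, P, Last} -> Last after 1000 -> timeout end
     || P <- Standbys].
```

replicas:demo() should return [s3, s3, s3]. Sending each update to only
some of the standbys would make the master do less work per update, at
the cost that the standbys no longer agree on the latest state.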
>> To preemptively answer the question I sent Ulf and Mazen... for that
>> distributed database, Process B would be set up to also receive read
>> requests forwarded from Process A but not respond to them unless it
>> gets an exit signal it can understand, like disk_fail_error. Any changes
>> in state would also be forwarded to existing redundant processes.
>>
>> Thanks everyone for your patience in helping me understand
>> fault-tolerance. I've already started thinking about some cool things
>> I can do with this in my program.
>>
>> - Edmond -
>>
>>
>>
>> On Tue, 04 Jan 2011 00:36:30 +1100, Joe Armstrong <> wrote:
>>
>>> I think you are getting confused between OTP supervisors etc. and the
>>> general notion of "take-over"
>>>
>>> Let's start with how we do error recovery.
>>>
>>> Imagine two linked processes A and B on separate machines. A is the
>>> master process.
>>> B is a passive process that will take over if A fails.
>>>
>>> A sends a stream of state update messages S1, S2, S3,... to B. The
>>> state messages contain enough information
>>> for B to do whatever A was doing should A fail. If A fails B will
>>> receive an EXIT signal.
>>>
>>> If B does receive an exit signal it knows A has failed, so it carries
>>> on doing whatever A was doing
>>> using the information it last received in a state update message.
>>>
>>> That's it and all about it - nothing to do with supervisors etc,
>>> <aside> This is how we do things in a distributed system. Obviously,
>>> real fault tolerance needs more than one machine to cover the case
>>> where an entire machine fails. This is also why we copy everything: if
>>> an entire machine has failed and we have a pointer to vital data on
>>> that machine, then we are in big trouble.
>>>
>>> If it works like this in the distributed case then we reasoned it
>>> should work exactly the same way when both processes are on the same
>>> machine. Why is this? - because it would be very difficult to program
>>> if the primitives and mechanisms involved were completely different
>>> when the remote process lay on the same machine as the process it was
>>> talking to or on a different machine.
>>>
>>> It might be that we actually handle these two cases differently in the
>>> VM, but as far as the programmer is concerned the system should behave
>>> the same way, and the only observable change should be
>>> concerned with latencies.
>>> </aside>
>>>
>>> To make the above possible we need a couple of primitives in Erlang,
>>> namely process_flag(trap_exit, true) and spawn_link. That's all you
>>> need.
>>>
>>> The basic system properties you need to make a fault-tolerant system
>>> are the abilities to detect remote errors and to send and receive
>>> messages.
>>>
>>> What the OTP supervisors etc. do is build layers of abstraction on
>>> top of spawn_link and trap_exit
>>> in order to simplify programming common use-cases.
>>>
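
To illustrate how a restart layer can sit on top of nothing more than
spawn_link and trap_exit, here is a hand-rolled restart loop. It is not
how OTP's supervisor module is implemented; the module name mini_sup,
the MaxRestarts limit and the {done, ...} / {gave_up, ...} messages are
assumptions made up for this sketch.

```erlang
-module(mini_sup).
-export([demo/0]).

%% Start WorkerFun in a linked process and restart it each time it
%% exits abnormally, giving up after MaxRestarts restarts.
supervise(WorkerFun, MaxRestarts, Reporter) ->
    process_flag(trap_exit, true),
    loop(spawn_link(WorkerFun), WorkerFun, MaxRestarts, 0, Reporter).

loop(Pid, WorkerFun, Max, Restarts, Reporter) ->
    receive
        {'EXIT', Pid, normal} ->
            %% Worker finished its job: nothing to repair.
            Reporter ! {done, Restarts};
        {'EXIT', Pid, _Reason} when Restarts < Max ->
            %% Worker crashed: start a fresh one and carry on.
            loop(spawn_link(WorkerFun), WorkerFun, Max, Restarts + 1, Reporter);
        {'EXIT', Pid, Reason} ->
            %% Too many crashes: stop restarting and report.
            Reporter ! {gave_up, Reason, Restarts}
    end.

demo() ->
    Self = self(),
    Worker = fun() -> exit(boom) end,   % a worker that always crashes
    spawn(fun() -> supervise(Worker, 2, Self) end),
    receive
        {gave_up, Reason, Restarts} -> {gave_up, Reason, Restarts}
    after 1000 -> timeout
    end.
```

mini_sup:demo() should return {gave_up, boom, 2}: the worker is started
once, restarted twice, and the third crash exceeds the limit. The real
OTP supervisor adds restart strategies, child specifications and restart
intensity over a time window on top of the same two primitives.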
>>> Fault detection and recovery is not arbitrary or magic in any sense.
>>> It has to be part of the system architecture like everything else. One
>>> common fault in making systems is to specify and design for the normal
>>> behavior but not specify and design how fault detection and correction
>>> work. This is probably due to a legacy of programming in languages that
>>> have very limited ability to detect errors and recover from them.
>>>
>>> A C programmer with a single thread of control will take extreme
>>> measures to avoid crashing their program. They have to: there is only
>>> one thread of control. Given multiple threads of control and the
>>> ability to remotely detect errors, your entire way of thinking changes,
>>> and the emphasis shifts from thinking "how do I avoid a crash" to
>>> thinking "given that a crash has occurred and been detected, how do I
>>> fix things up and carry on".
>>>
>>> The Erlang philosophy is that "things will crash", so we have to
>>> detect the crashes, fix the bit that broke and carry on - we do not
>>> make excessive efforts to avoid crashes, but we do try very hard to
>>> repair things after they have gone wrong.
>>>
>>> Fixing things after they have broken is often easier than preventing
>>> them from breaking. Preventing them from breaking is not theoretically
>>> possible in general - we can't even prove simple things, like that an
>>> arbitrary program will terminate (the famous halting problem) - but we
>>> can easily detect failures and put the system to rights by applying
>>> certain invariants.
>>>
>>> OTP supervisors are just trees of processes with certain restart rules
>>> that experience has shown us to be useful.
>>>
>>> The main value of supervisors etc. (and of all the OTP behaviors) is
>>> in building large systems with many programmers involved. Rather than
>>> inventing their own patterns, the programmers reuse a common set of
>>> patterns, which makes the projects easier to manage.
>>>
>>> /Joe
>>>
>>>
>>>
>>>
>>> On Mon, Jan 3, 2011 at 6:21 AM, Edmond Begumisa
>>> <> wrote:
>>>>
>>>> Thanks for your response.
>>>>
>>>> Firstly, let me make my question a little clearer...
>>>>
>>>> To rephrase: For processes, "share nothing for the sake of
>>>> concurrency" - I get, both in concept and application. "Share nothing
>>>> for the sake of fault-tolerance" - I get in concept but not in
>>>> application.
>>>>
>>>> Yet as I understand it, it is for the latter reason Erlang shares
>>>> nothing* and not the former. Interpretation: I must be completely
>>>> missing the point in regards to Erlang processes and sharing nothing.
>>>> This is what I want to understand in application. In addition to the
>>>> "side" effect of sane concurrency (which, coming from a chaotic
>>>> multi-threading shared-memory world, I fully appreciate and
>>>> practically make use of every day), how can I also make use of the
>>>> "real" reason Erlang processes share nothing -- fault tolerance?
>>>> Practically/illustratively speaking?
>>>>
>>>> *ETS being the obvious exception.
>>>>
>>>> Secondly, here's a mantra from Joe Armstrong...
>>>>
>>>> @ minute 17:26
>>>> http://www.se-radio.net/2008/03/episode-89-joe-armstrong-on-erlang/
>>>>
>>>> "[message passing concurrency]... the original reasons have to do with
>>>> fault tolerance... you have to copy all the data you need from
>>>> computer 1 to computer 2... if computer 1 crashes you take over on
>>>> computer 2... you can't have dangling pointers... that's the reason
>>>> for copying everything... it's got nothing to do with concurrency,
>>>> it's got a lot to do with fault-tolerance... if they don't crash you
>>>> could just have a dangling pointer and copy less data but it won't
>>>> work in the presence of errors..."
>>>>
>>>> I interpret this to mean that share-nothing between processes is more
>>>> about replicating valid state than isolating corrupted state as you
>>>> described.
>>>>
>>>> Indeed, Joe created an example on his blog...
>>>>
>>>>
>>>> http://armstrongonsoftware.blogspot.com/2007/07/scalable-fault-tolerant-upgradable.html
>>>>
>>>> It's algorithm 3 there I'm struggling with. Particularly where he
>>>> says...
>>>>
>>>> "... In practise we would send an asynchronous stream of messages
>>>> from N to N+1 containing enough information to recover if things go
>>>> wrong."
>>>>
>>>> Unfortunately, I couldn't find part II to that post (I don't think
>>>> there is one.) And I'm too green and inexperienced in the field of
>>>> fault-tolerant systems to figure it out on my own. I'm having trouble
>>>> visualising the practical here from the conceptual -- I need to be
>>>> shown how :(
>>>>
>>>> Also, I seem to be under the impression that the Erlang language has
>>>> some sort of schematics to do this built-in (i.e. deal with one
>>>> process taking over from another if the first fails) and this is the
>>>> reason processes share nothing. This seems to me to be something
>>>> different from supervision trees, which use exit-trapping to re-spawn
>>>> if a process fails with the active 'job' disappearing and any errors
>>>> logged (like restarting a daemon). My interpretation of the
>>>> fault-tolerance Erlang is supposed to enable (for those in the know)
>>>> is seamless take-over. The 'job' lives on but elsewhere.
>>>>
>>>> Using telecoms as an example: a phone call wouldn't be cut off when a
>>>> fault occurs; another node would seamlessly take over. This is how I
>>>> interpreted Joe's post and other descriptions of Erlang's
>>>> fault-tolerant features and I understand the key is in the share
>>>> nothing policy for processes. I'm sure I've misunderstood something
>>>> or everything :)
>>>>
>>>> - Edmond -
>>>>
>>>> PS: I've read the Manning draft. Great book. I don't know if the
>>>> answer lies in OTP (I searched and didn't find it). I suspect it's
>>>> lower -- probably how you organise your processes. Some
>>>> distributed-programming black-magic only Erlangers know about :)
>>>>
>>>>
>>>> On Mon, 03 Jan 2011 14:56:51 +1100, Alain O'Dea <> wrote:
>>>>
>>>>> On 2011-01-02, at 22:36, "Edmond Begumisa"  
>>>>> <>
>>>>> wrote:
>>>>>
>>>>>> Slight correction...
>>>>>>
>>>>>> On Mon, 03 Jan 2011 12:38:38 +1100, Edmond Begumisa
>>>>>> <> wrote:
>>>>>>
>>>>>>> Hello all,
>>>>>>>
>>>>>>> I've been trying to wrap my Erlang's fault tolerant features
>>>>>>> particularly in relation to processes.
>>>>>>>
>>>>>>
>>>>>> Should be: I've been trying to wrap my head around Erlang's fault
>>>>>> tolerant features particularly in relation to processes.
>>>>>>
>>>>>> Sorry.
>>>>>>
>>>>>>> I've heard/read repeatedly that the primary reason why Erlang's
>>>>>>> designers opted for a share-nothing policy is not rooted in
>>>>>>> concurrency but rather in fault-tolerance. When nothing is shared,
>>>>>>> everything is copied. When everything is copied processes can take
>>>>>>> over from one another when things fail. I follow this reasoning
>>>>>>> but I don't follow how to apply it.
>>>>>>>
>>>>>>> I fully understand and appreciate how supervision trees are used to
>>>>>>> restart processes if they fail. What I don't get is what to do when
>>>>>>> you don't want to restart but want to take over, say on another
>>>>>>> node. I know that at a higher-level, OTP has some
>>>>>>> take-over/fail-over schematics (at the application level.) I'm
>>>>>>> trying to understand things at the process level - why Erlang is
>>>>>>> the way it is so I can better use it to make my currently
>>>>>>> fault-intolerant program fault tolerant.
>>>>>>>
>>>>>>> Specifically, how can one process take over from another if it
>>>>>>> fails? It appears to me that the only way to do this would be to
>>>>>>> somehow retrieve not only the state of the process (say,
>>>>>>> gen_server's state) but also the messages in its mailbox. Where
>>>>>>> does the design decision to share-nothing for the sake of
>>>>>>> fault-tolerance come into play for processes? Please help me "get"
>>>>>>> this!
>>>>>>>
>>>>>>> Thanks in advance.
>>>>>>>
>>>>>>> - Edmond -
>>>>>>>
>>>>>>>
>>>>>
>>>>> Hi Edmond:
>>>>>
>>>>> Share-nothing helps with concurrent fault-tolerance by preventing one
>>>>> process from corrupting the state of another. Receive is a process'
>>>>> choice and it corrupts its own state if it receives bad data and lets
>>>>> it in.
>>>>>
>>>>> AFAIK OTP fault-tolerance doesn't mean no requests will fail, it
>>>>> means the system/sub-system will recover if a single request causes a
>>>>> process to crash. It's kind of like proper try/catch recovery applied
>>>>> to concurrent code. How you recover from the crash depends on the
>>>>> supervision strategy chosen. In some cases the supervisor can pass
>>>>> the state to the replacement process. In others this isn't necessary
>>>>> or even desirable since the state itself may involve resources lost
>>>>> in the crash or corrupted state that led to the crash.
>>>>>
>>>>> I am straying outside my knowledge here so this paragraph is
>>>>> guesswork. The message queue for a gen_server need not necessarily be
>>>>> lost when the callback module crashes. In theory OTP could (and might
>>>>> already) simply delegate the messages to the replacement process
>>>>> following a crash. Someone who knows OTP better than me would need to
>>>>> weigh in here though.
>>>>>
>>>>> I found http://manning.com/logan very informative in understanding
>>>>> OTP and its supervisor hierarchies.
>>>>>
>>>>> Cheers,
>>>>> Alain
>>>>> ________________________________________________________________
>>>>> erlang-questions (at) erlang.org mailing list.
>>>>> See http://www.erlang.org/faq.html
>>>>> To unsubscribe; mailto:
>>>>>
>>>>
>>>>
>>>> --
>>>> Using Opera's revolutionary e-mail client: http://www.opera.com/mail/
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>>
>
>

