Defensive programming

Joe Armstrong (AL/EAB) joe.armstrong@REDACTED
Wed Mar 29 09:08:43 CEST 2006


> -----Original Message-----
> From: owner-erlang-questions@REDACTED 
> [mailto:owner-erlang-questions@REDACTED] On Behalf Of Pupeno
> Sent: den 29 mars 2006 00:37
> To: erlang-questions@REDACTED
> Subject: Defensive programming
> 
> Hello,
> I am used to defensive programming and it's hard for me to 
> program otherwise. 

You are getting to the heart of the matter :-)


> Today I've found this piece of code I wrote some months ago:
> 
> acceptor(tcp, Module, LSocket) ->
>     case gen_tcp:accept(LSocket) of
>         {ok, Socket} ->
>             case Module:start() of   
>                 {ok, Pid} ->
>                     ok = gen_tcp:controlling_process(Socket, Pid),
>                     gen_server:cast(Pid, {connected, Socket}), 
>                     acceptor(tcp, Module, LSocket); 
>                 {error, Error} ->
>                     {stop, {Module, LSocket, Error}}
>             end;
>         {error, Reason} ->
>             {stop, {Module, LSocket, Reason}}
>     end;

14 lines of complex code - with a doubly indented case clause

> 
> is that too defensive ? should I write it this way
> 
> acceptor(tcp, Module, LSocket) ->
>     {ok, Socket} = gen_tcp:accept(LSocket),
>     {ok, Pid} = Module:start(),
>     ok = gen_tcp:controlling_process(Socket, Pid),
>     gen_server:cast(Pid, {connected, Socket}), 
>     acceptor(tcp, Module, LSocket);
> 

vs 6 lines of linear code with no conditional structure.

How can you ask the question - you KNOW the answer.

Six lines of linear code is far better than 14 lines of code with
conditional structure.

At some level your brain is crying out "the six line version is better"
- but this is counter to everything you have ever learnt about
"defensive programming". What you learnt about defensive programming
came from the rule book for writing sequential programs. Now you need
to unlearn this when writing Erlang programs.


I just say:

      "Let it crash"

To fully understand this statement you need to understand the underlying
philosophy of error handling in Erlang - and also what consequences this
philosophy has for how you actually write programs.

In what follows I will explain the philosophy and at the end show you
some code that follows from it.


Error handling in Erlang is based on the idea of "workers" and
"observers", where both workers and observers are processes.

      +-----------+                 +-----------+
      |   Worker  |------->---------|  Observer |
      +-----------+   error signal  +-----------+

Workers do the jobs - observers watch the workers.
If an error occurs in a worker, it should just crash.
When it crashes an error signal is sent to the observer.

The worker's job is to do the work and *nothing else*.
The observer's job is to correct errors and *nothing else*.

This provides a clean separation of concerns.

Note this method of structuring cannot be done in a sequential language
- since there is only one thread of control - thus in a sequential
language all error handling MUST be done *within* the process itself.

That's why you have to program defensively in a sequential language -
you get one thread of control and one chance to fix your error.

Why did things evolve this way? - the answer has to do with how we
program fault-tolerant systems.

To make a fault-tolerant system you need (at least) TWO computers (think
about it :-) - now suppose one computer crashes - how do you fix the
fault? On the other computer, since THERE IS NO ALTERNATIVE.

Now think "what are processes" - one way of thinking about processes is
to imagine them as tiny little machines - if this is the case then error
handling should be done in the same way as with real machines. Why? -
you ask. So that there is no semantic mismatch when we model real-world
behaviour as sets of processes.

If you have N independent things in the real world, you model them with
EXACTLY N Erlang processes and you set up error observation channels
exactly as they would occur in the real world - as far as possible your
program should be isomorphic to the problem - that way the code will
virtually "write itself" - deviation from this will lead to a mess.

Now the worker-observer error handling model is sufficient for most
simple problems, but for complex problems we might imagine building a
hierarchical tree of workers and observers where the observers
themselves are observed by some other observers - a management tree, as
it were.

This generalisation is called a "supervision tree" and is one of the
standard behaviours in the OTP libraries.
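
As a rough illustration only, a one-worker supervision tree might look
like the sketch below. The module name demo_sup, the child id worker and
the little divide worker are made up for this example, and the restart
limits (5 restarts in 10 seconds) are arbitrary. The worker has no error
handling at all - dividing by zero simply kills it and the supervisor
starts a fresh one.

```erlang
-module(demo_sup).
-behaviour(supervisor).
-export([start_link/0, init/1, start_worker/0]).

%% Start the supervisor and register it locally as demo_sup.
start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

%% Supervisor callback: describe the restart policy and the children.
init([]) ->
    %% one_for_one: restart only the child that died. Give up if more
    %% than 5 restarts are needed within 10 seconds (arbitrary values).
    SupFlags = #{strategy => one_for_one, intensity => 5, period => 10},
    Worker = #{id => worker,
               start => {?MODULE, start_worker, []},
               restart => permanent},
    {ok, {SupFlags, [Worker]}}.

%% Called by the supervisor; the worker must be linked to the caller.
start_worker() ->
    Pid = spawn_link(fun worker_loop/0),
    {ok, Pid}.

%% Hypothetical worker with no defensive code: B = 0 crashes it.
worker_loop() ->
    receive
        {divide, From, A, B} ->
            From ! {result, A div B},
            worker_loop()
    end.
```

After demo_sup:start_link(), supervisor:which_children(demo_sup) lists
the worker's pid. Sending that pid {divide, self(), 1, 0} kills it, and
a second call to which_children should show a different pid.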

Rather than learning how to use the supervision tree (which is overkill
for many small applications) the best approach is to use the simplest
form of error recovery.

A bit of code like this:

	observer(Fun) ->
	    process_flag(trap_exit, true),
	    Pid = spawn_link(Fun),
	    receive
	        {'EXIT', Pid, Why} ->
	            io:format("worker died with:~p~n", [Why])
	    end.

sets everything up. This process spawn_links Fun (i.e. spawns it with a
link) - the trap_exit flag is needed because without it the watching
process will die if the spawned process crashes.

Now you write Fun *with no error handling* - you'll get a few error
messages that are printed out - decide which ones are recoverable, write
a bit of error correcting code, and you are done.
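
One possible shape for that "bit of error correcting code" is sketched
below. It is only a sketch: the module name restarting_observer and the
second argument Recoverable (a fun that is given the exit reason and
says whether a restart is worthwhile) are assumptions made for this
example. All the recovery logic lives in the observer; the worker stays
free of it.

```erlang
-module(restarting_observer).
-export([observer/2]).

%% Fun is the worker. Recoverable is a hypothetical predicate: it takes
%% the exit reason and returns true if the worker should be restarted.
observer(Fun, Recoverable) ->
    process_flag(trap_exit, true),
    Pid = spawn_link(Fun),
    receive
        {'EXIT', Pid, normal} ->
            %% The worker finished its job without crashing.
            ok;
        {'EXIT', Pid, Why} ->
            io:format("worker died with:~p~n", [Why]),
            case Recoverable(Why) of
                true  -> observer(Fun, Recoverable);  % start a new worker
                false -> {gave_up, Why}
            end
    end.
```

For example, observer(Fun, fun(timeout) -> true; (_) -> false end)
restarts the worker whenever it exits with the reason timeout and gives
up on anything else.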

Agggh - there's a gotcha here. If you run observer(Fun) in the shell the
trap_exit call will affect the shell itself - also if the observer dies
it might crash the shell (I'm not sure if the current shell spawns, or
spawn_links, or applies the arguments it is given).

In any case it is good practice not to assume anything about the shell.
So if we just define

	run(Fun) -> spawn(fun() -> observer(Fun) end).

Then evaluating run(Fun) sets up a worker which evaluates Fun and an
observer which prints an
error if the worker dies.

That's all you need.

Type this code in, run it, and understand it.

Then write the worker with no messy error handling code - just "let it
crash".
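
To make that concrete, here is a small self-contained sketch that puts
run/1 and observer/1 in one module together with a worker. The module
name let_it_crash, the function demo/0 and the file name are invented
for the example. The worker asserts what it expects with a pattern match
and does nothing else - if the file is missing the match fails, the
worker dies, and the observer reports why.

```erlang
-module(let_it_crash).
-export([run/1, observer/1, demo/0]).

%% Run the observer in its own process so the caller (e.g. the shell)
%% is not affected by trap_exit or by the observer dying.
run(Fun) -> spawn(fun() -> observer(Fun) end).

%% Spawn the worker with a link and report its exit reason.
observer(Fun) ->
    process_flag(trap_exit, true),
    Pid = spawn_link(Fun),
    receive
        {'EXIT', Pid, Why} ->
            io:format("worker died with:~p~n", [Why])
    end.

%% Hypothetical worker: no case, no try - just "this must be ok".
%% The path is assumed not to exist, so the match fails with badmatch.
demo() ->
    run(fun() ->
            {ok, Bin} = file:read_file("/no/such/file"),
            io:format("read ~p bytes~n", [byte_size(Bin)])
        end).
```

Calling let_it_crash:demo() from the shell returns the observer's pid
straight away; a moment later the observer prints a "worker died with:"
line whose reason is a badmatch on {error,enoent}.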

Follow these simple rules and your code will be beautiful and easy to
understand.


   " -- Break any of these rules sooner than say anything outright
   barbarous --"

	George Orwell, "Politics and the English Language" (1946)

Cheers

/Joe




