[erlang-questions] Question about message passing paradigm

Wed Jul 2 02:20:57 CEST 2008

On 1 Jul 2008, at 7:37 pm, Edwin Fine wrote:
> Thank you for a very interesting and informative analysis. I must  
> admit that I tend to lump together Erlang/OTP as "Erlang" and see  
> solutions in that context. In a way, I feel as if using Erlang  
> "standalone" without OTP is very roughly analogous to using C++  
> without the STL: it can be done, but why on earth would one want to?

Because the STL
  - is still not fully portable between compilers;
    in theory it should be, but it takes you into
    deep template territory where compilers have
    incompatible bugs (though they are improving)
  - is in my experience less efficient than home-brew
    code (the STL relies, like Haskell, on *serious*
    optimisation which compilers do not always do)
  - gets you some of the most incomprehensible
    compiler error messages (un)imaginable when you
    make mistakes, which you always do, because
  - it is unpleasantly complex to use (this will, I
    hope, be remedied in C++0x, at least if I can
    trust Stroustrup's summary of what's going to be
    there).

None of these applies to OTP, but the argument that

  - it is another large and complex body of material
    to master on top of something itself unfamiliar.

does apply to both.  I have a colleague who has done
serious commercial software development in C++ and
flatly refuses to have anything to do with the STL
(see reasons above).

Let's take the current example and see if we can squeeze a
bit more out of it.  Tim Watson pointed out that there is
an issue about "locking" and failing processes.  My take
on this is "let it crash": when processes are coupled like
this they should as a rule be linked and if one dies,
all should die.  The OTP behaviours take care of that kind
of thing, which is why *once you have grasped the basics*
you should try them before rolling your own.  What's really
interesting here is that the original system was written in
terms of threads and locking.  Have you looked at thread
interfaces lately?

[Single Unix Specification version 3]

pthread_mutex_lock(&mutex)
pthread_mutex_unlock(&mutex)

There are four kinds of mutex in POSIX threads:
   PTHREAD_MUTEX_NORMAL
	recursive locking deadlocks
	unlocking a mutex that is already unlocked
	or that is locked by another => undefined
   PTHREAD_MUTEX_ERRORCHECK
	these conditions return an error code
   PTHREAD_MUTEX_RECURSIVE
	recursive locking works
	unlocking a mutex that is already unlocked
	or that is locked by another => error code
   PTHREAD_MUTEX_DEFAULT
	same as PTHREAD_MUTEX_NORMAL
   Anything else
	All behaviour undefined
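
To make the table concrete, here is a minimal sketch in C of
the two kinds whose misuse is actually reported.  The helper
names (init_typed_mutex, errorcheck_reports_misuse,
recursive_counts_locks) are made up for illustration; the
error codes EDEADLK and EPERM are the ones POSIX documents
for these cases.  The NORMAL/DEFAULT cases are deliberately
not exercised, since they deadlock or are undefined.

```c
#define _XOPEN_SOURCE 700   /* expose pthread_mutexattr_settype in strict modes */
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical helper: initialise *m as a mutex of the given type.
   Returns 0 on success, otherwise a pthread error number. */
static int init_typed_mutex(pthread_mutex_t *m, int type)
{
    pthread_mutexattr_t attr;
    int rc;

    assert(m != 0);
    rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_settype(&attr, type);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* PTHREAD_MUTEX_ERRORCHECK: misuse comes back as an error code.
   Returns 1 if every call behaved as the table says, else 0. */
static int errorcheck_reports_misuse(void)
{
    pthread_mutex_t m;
    int bad_unlock, first, relock, unlock;

    if (init_typed_mutex(&m, PTHREAD_MUTEX_ERRORCHECK) != 0)
        return 0;
    bad_unlock = pthread_mutex_unlock(&m);  /* not locked: EPERM */
    first      = pthread_mutex_lock(&m);    /* succeeds */
    relock     = pthread_mutex_lock(&m);    /* recursive lock: EDEADLK */
    unlock     = pthread_mutex_unlock(&m);  /* succeeds */
    pthread_mutex_destroy(&m);
    return bad_unlock == EPERM && first == 0 &&
           relock == EDEADLK && unlock == 0;
}

/* PTHREAD_MUTEX_RECURSIVE: the owner may relock; one unlock too
   many is reported.  Returns 1 if all calls behaved so, else 0. */
static int recursive_counts_locks(void)
{
    pthread_mutex_t m;
    int l1, l2, u1, u2, u3;

    if (init_typed_mutex(&m, PTHREAD_MUTEX_RECURSIVE) != 0)
        return 0;
    l1 = pthread_mutex_lock(&m);
    l2 = pthread_mutex_lock(&m);    /* second lock by the owner works */
    u1 = pthread_mutex_unlock(&m);
    u2 = pthread_mutex_unlock(&m);  /* now fully released */
    u3 = pthread_mutex_unlock(&m);  /* already unlocked: EPERM */
    pthread_mutex_destroy(&m);
    return l1 == 0 && l2 == 0 && u1 == 0 && u2 == 0 && u3 == EPERM;
}
```

Both functions can be called from a single thread; on older
systems the program may need to be linked with -lpthread.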

[Solaris 2.10]
If
(1) _POSIX_THREAD_PRIO_INHERIT is defined
(2) the mutex was initialised with protocol
     attribute PTHREAD_PRIO_INHERIT
(3) the mutex was initialised with robustness
     attribute PTHREAD_MUTEX_ROBUST_NP
(4) the last holder of the lock crashed
then
(A) an attempt to lock the mutex will 'fail'
     with the error code EOWNERDEAD
(B) but in fact the attempt will have succeeded
(C) it is up to the *new* owner of the lock to
     try to clean up the state
(D) if it can, it calls pthread_mutex_consistent_np
(E) if it crashes, the next locker will get the
     same error code and the same chance to recover
(F) if it can't, it should unlock the mutex, and
     future lockers will get a *different* error code
     (ENOTRECOVERABLE).
(G) it is possible to call pthread_mutex_consistent_np
     on mutexes that aren't held or didn't need
     recovery and the behaviour is undefined
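
Steps (A) to (F) can be sketched in C as follows.  This is an
illustration under stated assumptions, not the Solaris
interface itself: it uses the names later standardised in
POSIX.1-2008 (pthread_mutexattr_setrobust,
PTHREAD_MUTEX_ROBUST, pthread_mutex_consistent) in place of
the _np spellings, it assumes a platform where robustness
does not depend on the priority-inheritance protocol (as I
understand to be the case with glibc on Linux), and a thread
that returns while still holding the lock stands in for a
crashed holder.  The names die_holding_lock and
lock_after_owner_death are made up for the example.

```c
#define _XOPEN_SOURCE 700   /* expose the robust-mutex declarations */
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Thread body: take the lock and terminate without releasing it. */
static void *die_holding_lock(void *arg)
{
    pthread_mutex_t *m = arg;

    pthread_mutex_lock(m);
    return NULL;
}

/* Hypothetical helper.  Creates a robust mutex, lets a thread die
   while holding it, then locks it twice from the caller.
   *first_result receives what the first locker saw.  If repair is
   nonzero the first locker marks the state consistent (step D),
   otherwise it just unlocks (step F).  The return value is what
   the *next* lock attempt reports. */
static int lock_after_owner_death(int repair, int *first_result)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t m;
    pthread_t t;
    int next;

    assert(first_result != NULL);
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
    pthread_mutex_init(&m, &attr);
    pthread_mutexattr_destroy(&attr);

    pthread_create(&t, NULL, die_holding_lock, &m);
    pthread_join(t, NULL);                  /* the holder is now dead */

    /* (A), (B): reports EOWNERDEAD, yet the caller now owns the lock. */
    *first_result = pthread_mutex_lock(&m);
    if (*first_result == EOWNERDEAD && repair)
        pthread_mutex_consistent(&m);       /* (D): state repaired */
    pthread_mutex_unlock(&m);               /* (F) if not repaired */

    next = pthread_mutex_lock(&m);
    if (next == 0)
        pthread_mutex_unlock(&m);
    pthread_mutex_destroy(&m);
    return next;
}
```

With repair set, the second lock should return 0; without it,
ENOTRECOVERABLE, after which the mutex is good for nothing
but being destroyed.  Note how much protocol every locker
has to get right just to survive somebody else's crash.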

If a mutex with the default PTHREAD_MUTEX_STALLED_NP
robustness value is held by a thread that dies,
future locks are "blocked in an unspecified manner".
What this means in practice I'm not sure.

If you reckon the Solaris 2.10 treatment of crashed
lock holders is a mess, perhaps you can point me to
something better in SUSv3.  I can't find anything in
SUSv3 to say _what_ happens when a lock owner crashes.
(And don't get me onto the subject of cancellation points.)

What kind of building block is _this_ for building
reliable systems?

Message passing plus linking is so much easier to
have justified confidence in that pthreads and TBB start
to look like extremely sick jokes.

> <off-topic>One last thing: I read the Ethics of Belief after poring  
> over one of your posts recently, and was exceptionally impressed  
> with the gentleman's writing and philosophy, with which I strongly  
> agree. In that regard, I'd like to misquote John Stuart Mill,  
> namely, "A foolish certainty is the hobgoblin of little minds".  
> (Actually, I think my version is a slight improvement ;) I am  
> unfortunately seeing the mechanism of insufficiently examined  
> beliefs at work today, resulting in the persecution of a friend of  
> mine by way of (the almost totally belief-based) Shaken Baby  
> Syndrome. So the essay really resonated deeply with me. I daresay  
> William K. Clifford would have had a lot to say about this. I wish  
> he were still alive to do so. Bertrand Russell too.</offtopic>

<also-off-topic>Clifford takes a really good idea and pushes it
beyond the bounds of reason; he manages, presumably without
intending to, to make any belief in science ethically
unjustifiable.  Specifically and in Clifford's day topically,
Clifford's rule about believing other people meant that it would
have been *bad* for anyone to believe in Natural Selection.  I
encountered Clifford's paper in a book by DeMarco and someone else
about risk management in software engineering.
I find the idea that Shaken Baby is still credited deeply
upsetting; please convey my sympathy and good wishes to your
friend.
</also-off-topic>