[erlang-questions] Please criticise these principles

Wed Aug 27 06:47:21 CEST 2008

I've just been looking at ISO/IEC DTR 13211-5:2007
"Prolog Multi-threading predicates".
This is a proposed addition to the ISO Prolog standard,
whose declared aim is "to promote the portability of
multi-threaded Prolog applications".

As I read through it, I felt sicker and sicker and
sicker.  If any of you are parents, you may know the
feeling when a child is being naughty and seems to be
going out of his/her way to do things that are OBVIOUSLY
to his/her detriment.

I've boiled my reactions down to "here is a short list
of design principles, every single one of which is
violated by the proposal."  Before sending them off to
the ISO Prolog crowd, I thought I'd ask the opinion of
Erlangers, especially Joe Armstrong, should he happen
to read this.

I concede that debugging tools may need to do all sorts
of things that are otherwise risky.  There are quite a
few predicates in the 'erlang' module that are labelled
as "debugging only".  What I'm talking about is *core*
facilities to be used in *normal* code that is meant to
be portable and reliable.

Some principles for simpler safer threading.
============================================

1.  No omniscient users.

     Users shall not be required to provide information,
     such as space allocation or tuning parameters, the
     values of which they cannot determine.

     Application:  consider a list cell.  In NU Prolog,
     a list cell holding a single character could be as
     little as 1 byte (on a 32-bit machine).  On some
     systems, a list cell could be 4 words, which on a
     64-bit system would be 32 bytes.  While it is
     imaginable that a user might know how many list cells
     a thread would need, it is not possible for them to
     say how many BYTES will be needed and if they did
     give a number it would not be portable.  This means
     that while a user *could* give an initial heap size
     for a thread, they could *not* give a *fixed* size
     that would suit all systems.
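
     To make the arithmetic concrete, here is a small sketch in
     Python (chosen only for illustration; it is not a language
     the draft or this post is about).  The two cell sizes are
     the extremes mentioned above; the names CELL_BYTES and
     heap_bytes, and the figure of a million cells, are
     assumptions of the sketch.

```python
# Assumed per-cell sizes: the two extremes mentioned in the text.
CELL_BYTES = {
    "compact": 1,          # 1 byte per list cell
    "four_words_64": 4 * 8,  # 4 words of 8 bytes each
}


def heap_bytes(cells):
    """Bytes needed to hold `cells` list cells under each assumed layout."""
    return {layout: cells * size for layout, size in CELL_BYTES.items()}


# The same request, "room for a million list cells", in bytes.
needs = heap_bytes(1_000_000)
for layout, size in needs.items():
    print(layout, size)
```

     The same workload differs by a factor of 32 in bytes, so
     any fixed byte count is right for at most one of the two
     systems.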

2.  No distinction between indistinguishables.

     A specification shall not mandate distinct responses
     to situations that user programs cannot distinguish.

     Application: sending a message to a process.
                  Case 1                    Case 2
          T+1     Pid is alive              Pid is alive
          T+2     Pid dies                  Pid ! Message
          T+3     Pid ! Message             Pid dies
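     The ISO draft requires case 1 to produce a runtime
     exception in the send call.  In case 2, Pid dies after
     the send has returned, so there is no send call left to
     be blamed and there is no such exception.  Yet the sender
     cannot tell the two cases apart: either way the message
     is never acted on.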
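
     A minimal sketch of the alternative, in Python for
     illustration only: a send that behaves identically
     whether the receiver is alive or dead.  The Process
     class and its methods are hypothetical, and a thread
     with a queue is only a rough stand-in for a process
     with a mailbox.

```python
import queue
import threading


class Process:
    """Toy process: a thread plus a mailbox. All names are hypothetical."""

    def __init__(self):
        self.mailbox = queue.Queue()
        self._time_to_die = threading.Event()
        # The thread waits until its time is up, then exits without
        # ever reading its mailbox.
        self._thread = threading.Thread(target=self._time_to_die.wait)
        self._thread.start()

    def die(self):
        """Let the process terminate, and wait until it really has."""
        self._time_to_die.set()
        self._thread.join()

    def send(self, message):
        """Fire and forget: never raises, whatever state the receiver is in."""
        self.mailbox.put(message)


# Case 1: Pid dies, then Pid ! Message.
p1 = Process()
p1.die()
result1 = p1.send("hello")

# Case 2: Pid ! Message, then Pid dies without reading it.
p2 = Process()
result2 = p2.send("hello")
p2.die()

# The sender observes the same thing in both cases.
print(result1 == result2)  # prints True
```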

3.  No breaches of encapsulation.

     If process A wants process B to do something, it should
     ASK.  It should not FORCE process B to perform some action.

     Application 1:  the ISO draft includes an operation to
     kill any process.  There are mutexes.  There are global
     variables of a kind.  If you kill a process that is
     holding some mutexes, all those mutexes are released.
     This means that all the data protected by those mutexes
     is now in an unknown state and you dare not use it for
     the rest of the program's existence.

     Application 2:  the ISO draft includes an operation
     thread_signal(Thread, Goal) which causes Thread to be
     interrupted at the next opportunity and forced to call
     Goal.  The goal can do anything, including unlocking a
     mutex that the Thread is holding (and after the
     interrupt, mistakenly believes it is still holding).
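
     The difference between asking and forcing can be sketched
     as follows (Python, for illustration only; the names
     worker, inbox, and account are assumptions).  The stop
     request is just another message, so the worker only ever
     sees it between units of work, never while it holds the
     lock.

```python
import queue
import threading

lock = threading.Lock()
account = {"a": 100, "b": 0}  # invariant: a + b == 100


def worker(inbox):
    """Transfer amounts from a to b until asked to stop."""
    while True:
        message = inbox.get()
        if message == "stop":
            # The request is honoured here, where no lock is held
            # and the invariant is intact.
            return
        with lock:
            account["a"] -= message
            account["b"] += message


inbox = queue.Queue()
t = threading.Thread(target=worker, args=(inbox,))
t.start()
for amount in (10, 20, 30):
    inbox.put(amount)
inbox.put("stop")  # ASK the worker to finish; nothing is forced on it
t.join()

print(account["a"] + account["b"])  # prints 100
```

     Had the worker been killed from outside between the two
     updates, the lock would have been released with a + b no
     longer equal to 100.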

4.  No unprotected shared mutable variables.

     While some thread has the power to write a variable,
     it is VERIFIED that no other thread has the power to
     read or write that variable.

     Application 1: Prolog has an analogue of Erlang process
     dictionaries, but it is global.  [More precisely, it is
     partitioned into named pieces each of which is local to
     a *module*, but the pieces are global to *threads.*]
     While SWI Prolog offers thread-local mutable data, the
     ISO draft includes no such thing.  It's as if Erlang
     offered only global ETS tables accessed without locks.
     While mutexes (but oddly, not reader/writer locks) are
     present in the ISO draft, there is no *intrinsic*
     connection between any mutable table and any mutex.

     Application 2: the draft introduces three kinds of IPC
     data:  thread IDs, mutex IDs, and message queue IDs.
     There are three name-spaces for 'aliases', rather like
     the Erlang registry for process ids.  These things are
     in effect mutable variables.  There are operations to
     create and destroy threads, mutexes, and message queues.
     Although there are no operations for rebinding aliases,
     this can happen:
          create a thingy and give it the alias 'fred'
          create a thread that refers to 'fred'
          destroy the thingy
          create another thingy and give it the alias 'fred'
     So the other thread *thinks* it knows what 'fred' refers
     to, but it is wrong.  As an example, there is a
     'thread_join(Thread, Result)' operation which waits for
     the Thread to complete and then picks up its Result; if
     Thread is an alias, this could wait for the wrong thread.
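
     The four steps above can be sketched without any threads
     at all (Python, for illustration only; registry, Thingy,
     create, and destroy are hypothetical names, not anything
     from the draft).  A client that remembers only the alias
     silently ends up with the second thingy.

```python
registry = {}  # alias -> object, shared by everyone


class Thingy:
    def __init__(self, label):
        self.label = label


def create(alias, label):
    registry[alias] = Thingy(label)
    return registry[alias]


def destroy(alias):
    del registry[alias]


first = create("fred", "first")
remembered_alias = "fred"            # the client keeps only the name
remembered_handle = registry["fred"]  # a more careful client keeps the object

destroy("fred")
second = create("fred", "second")

resolved = registry[remembered_alias]
print(resolved.label)           # prints second
print(remembered_handle.label)  # prints first
```

     No operation ever "rebound" the alias, yet the name now
     means something else.  Holding the handle itself, rather
     than the name, is what keeps the careful client honest.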

5.  No intrinsically unreliable information flows.

     There should be no query operations that give you
     information that you would have to be crazy to use.
     In particular, if you want some information about a
     thread, you should ASK it [so this may be a version
     of principle 3] and then you know that the information
     should be interpreted with reference to that specific
     synchronisation point.

     Application:  the ISO draft provides some operations
     of which it says "almost any usage of these ... is
     unsafe".  These relate to finding the 'instantaneous'
     state of IPC objects.  Because these are 'direct'
     queries that do not involve any explicit synchronisation,
     the point in the lifetime of the other thread that they
     refer to is entirely unknown.  You cannot expect these
     values to apply "now" (whatever that means) and you
     cannot tell at _what_ point in the past of the other
     thread they do relate to.
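
     By contrast, here is a minimal sketch of asking a thread
     for its state (Python, for illustration only; the message
     shapes and the name counter are assumptions).  Because the
     query travels in the same mailbox as everything else, the
     answer refers to a known point: after every message sent
     before the query and before every message sent after it.

```python
import queue
import threading


def counter(inbox):
    """Own a count; change it or report it only in response to messages."""
    count = 0
    while True:
        kind, payload = inbox.get()
        if kind == "add":
            count += payload
        elif kind == "get":
            payload.put(count)  # payload is the asker's reply queue
        elif kind == "stop":
            return


inbox = queue.Queue()
t = threading.Thread(target=counter, args=(inbox,))
t.start()

inbox.put(("add", 1))
inbox.put(("add", 2))
reply = queue.Queue()
inbox.put(("get", reply))  # the synchronisation point
inbox.put(("add", 10))     # sent after the query, so not reflected

answer = reply.get()
inbox.put(("stop", None))
t.join()

print(answer)  # prints 3
```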

6.  No zombies.

     When a process dies, a death notice should be sent to
     its family and friends, if any, but the process itself
     should disappear completely.

     Application: because the thread_join/2 operation
     merely _exists_ in the interface, at least the full
     exit or exception status of a process must be kept
     around as long as there is a live copy of its Pid
     anywhere in a process or the global data base, in
     case someone should wait for it.  The term given to
     thread_exit/1 could be arbitrarily large.  There is
     no way to promise that you won't use thread_join.
     In effect this is a mandatory space leak.
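
     A sketch of the alternative, death notices pushed to
     whoever asked for them in advance (Python, for
     illustration only; spawn and the notice tuples are
     hypothetical, loosely modelled on Erlang links and
     monitors rather than copied from them).  Nothing is kept
     on behalf of a join that may never come.

```python
import queue
import threading


def spawn(fn, monitors):
    """Run fn in a new thread; post one death notice to each monitor mailbox."""

    def run():
        try:
            notice = ("exit", fn())
        except Exception as exc:
            notice = ("error", type(exc).__name__)
        for mailbox in monitors:
            mailbox.put(notice)
        # Once the notices are posted, nothing about the outcome is
        # retained on the dead thread's behalf.

    threading.Thread(target=run).start()


notices = queue.Queue()

spawn(lambda: 42, [notices])
first = notices.get(timeout=5)

spawn(lambda: 1 // 0, [notices])
second = notices.get(timeout=5)

print(first)
print(second)
```

     With an empty monitor list the result is simply dropped,
     which is the promise "nobody will wait for this thread"
     that thread_join/2 gives you no way to make.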

While Erlang doesn't perfectly conform to these principles
(the process registry being a particularly painful example),
you can program *as if* it did.  And if you think I would
prefer multi-threading in Prolog to look as much as possible
like Erlang, why yes, I would.  I would like that very much.

What got me looking at this was someone asking me to review
a paper about how to implement thread_cancel/1, the operation
that kills any thread.  The paper claims that

	"The ability to cancel a thread is useful for
	application development and is critical to
	Prolog embeddability."

and I found myself saying "but the ability to cancel a
thread is like the ability to apply a chainsaw to your
own neck!  It's an incredibly easy way to violate
system integrity."

What's really frightening is that if I hadn't been exposed to
Erlang, my previous exposure to Ada and Occam and Concurrent
Pascal, nice though they are, might not have been enough to
stop me reading the DTR and going "yeah, this looks like a
fairly straightforward layer over pthreads, nice job" instead
of "yuck".  THANK YOU JOE!

--
If stupidity were a crime, who'd 'scape hanging?