[erlang-questions] Please criticise these principles
Joe Armstrong
erlang@REDACTED
Wed Aug 27 10:54:35 CEST 2008
Hi Richard, I've added my comments in-line.
/Joe
On Wed, Aug 27, 2008 at 6:47 AM, Richard A. O'Keefe <ok@REDACTED> wrote:
> ...
>
> I've boiled my reactions down to "here is a short list
> of design principles, every single one of which is
> violated by the proposal." Before sending them off to
> the ISO Prolog crowd, I thought I'd ask the opinion of
> Erlangers, especially Joe Armstrong, should he happen
> to read this.
I'm reading with bated breath. (BTW I still love Prolog - Robert Virding has
implemented a Prolog in Erlang which runs at 100 KLips on the nrev
benchmark.)
I've found the old Erlang-in-Prolog emulator, so soon we should be able to
implement erlang-in-prolog-in-erlang :-)
>
> I concede that debugging tools may need to do all sorts
> of things that are otherwise risky. There are quite a
> few predicates in the 'erlang' module that are labelled
> as "debugging only". What I'm talking about is *core*
> facilities to be used in *normal* code that is meant to
> be portable and reliable.
>
Absolutely - things like erlang:display/1 are *very* useful for debugging
the I/O code but should never be used in production programs.
>
>
> Some principles for simpler safer threading.
> ============================================
>
>
> 1. No omniscient users.
>
> Users shall not be required to provide information,
> such as space allocation or tuning parameters, the
> values of which they cannot determine.
>
> Applicability: consider a list cell. In NU Prolog,
> a list cell holding a single character could be as
> little as 1 byte (on a 32-bit machine). On some
> systems, a list cell could be 4 words, which on a
> 64-bit system would be 32 bytes. While it is
> imaginable that a user might know how many list cells
> a thread would need, it is not possible for them to
> say how many BYTES will be needed and if they did
> give a number it would not be portable. This means
> that while a user *could* give an initial heap size
> for a thread, they could *not* give a *fixed* size
> that would suit all systems.
Agree
> 2. No distinction between indistinguishables.
>
> A specification shall not mandate distinct responses
> to situations that user programs cannot distinguish.
>
> Application: sending a message to a process.
>          Case 1           Case 2
>   T+1    Pid is alive     Pid is alive
>   T+2    Pid dies         Pid ! Message
>   T+3    Pid ! Message    Pid dies
> The ISO draft requires case 1 to produce a runtime
> exception in the send call. In case 2, Pid dies
> and there is no send call to be blamed, so there is
> no such exception.
Impossible. Suppose Pid is on a remote machine. You cannot distinguish
communication failure from machine failure, so you cannot implement this.
The Erlang Tao says: if you want to know whether a message was received,
have the receiver send a reply and wait for it.
Even if you send a message and it is received there is no guarantee of
"liveness": the receiver might receive the message and go into an infinite
loop. This is why we invented the link mechanism.
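
The reply-and-wait idiom above can be sketched in Erlang roughly as follows.
This is a minimal illustration, not code from the thread: the module name
reply_demo, the echo server, and the message shapes {From, Ref, Request} and
{Ref, Reply} are all hypothetical. The sketch uses a monitor (a one-way
relative of the link mechanism) so that the caller also learns when the
server dies; that choice is an assumption of the example.

```erlang
-module(reply_demo).
-export([start_echo/0, call/3]).

%% Hypothetical server: answers every request with {echo, Request}.
start_echo() ->
    spawn(fun echo_loop/0).

echo_loop() ->
    receive
        {From, Ref, Request} ->
            From ! {Ref, {echo, Request}},
            echo_loop()
    end.

%% Send Request to Pid and wait for an explicit reply.
%% A bare "Pid ! Msg" never tells the sender whether the message arrived;
%% only the reply, the 'DOWN' notice, or the timeout does.
call(Pid, Request, Timeout) ->
    Ref = erlang:monitor(process, Pid),
    Pid ! {self(), Ref, Request},
    receive
        {Ref, Reply} ->
            erlang:demonitor(Ref, [flush]),
            {ok, Reply};
        {'DOWN', Ref, process, Pid, Reason} ->
            %% The server died (or never existed) before replying.
            {error, {down, Reason}}
    after Timeout ->
        %% No reply in time: the server may be dead, unreachable or looping.
        erlang:demonitor(Ref, [flush]),
        {error, timeout}
    end.
```

The caller gets one of three answers, {ok, Reply}, {error, {down, Reason}}
or {error, timeout}, and in no case does the send itself raise an exception.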
> 3. No breaches of encapsulation.
>
> If process A wants process B to do something, it should
> ASK. It should not FORCE process B to perform some action.
Yes
> Application 1: the ISO draft includes an operation to
> kill any process. There are mutexes. There are global
> variables of a kind. If you kill a process that is
> holding some mutexes, all those mutexes are released.
> This means that all the data protected by those mutexes
> is now in an unknown state and you dare not use it for
> the rest of the program's existence.
Madness
>
> Application 2: the ISO draft includes an operation
> thread_signal(Thread, Goal) which causes Thread to be
> interrupted at the next opportunity and forced to call
> Goal. The goal can do anything, including unlocking a
> mutex that the Thread is holding (and after the
> interrupt, mistakenly believes it is still holding).
Daft
> 4. No unprotected shared mutable variables.
>
> While some thread has the power to write a variable,
> it is VERIFIED that no other thread has the power to
> read or write that variable.
I have always thought that there should not be shared variables AT ALL.
You can't actually share a variable, it's a violation of causality - think
rays of light, electrons running along wires, think time, failures.
The problem is not the mutexes, it's the shared state. Since you shouldn't
have shared state, then you shouldn't need mutexes to protect the shared state.
Erlang programmers have happily been writing distributed and concurrent
programs for twenty years without the use of mutexes - they are just NOT
needed.
Now deep under the covers, where no user should be lurking, there are some
ets tables - *which were added to implement large databases* (solving the
"we can't copy the entire universe" problem). The correct way to use ets
tables is not to use them directly, but to use them via mnesia transactions
(a form of transactional memory). If you really know what you are doing you
can disregard this advice.
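
As a rough illustration of state without mutexes, the sketch below keeps a
counter inside a single process and lets other processes change it only by
sending messages. The module name counter_demo, the message formats and the
worker counts are hypothetical choices made for this example.

```erlang
-module(counter_demo).
-export([start/0, increment/1, value/1, run/0]).

%% The counter value lives only in the argument of loop/1.
%% No other process can read or write it directly.
start() ->
    spawn(fun() -> loop(0) end).

loop(N) ->
    receive
        increment          -> loop(N + 1);
        {value, From, Ref} -> From ! {Ref, N}, loop(N);
        stop               -> ok
    end.

increment(Pid) ->
    Pid ! increment,
    ok.

%% Ask the owner for the value and wait for its reply.
value(Pid) ->
    Ref = make_ref(),
    Pid ! {value, self(), Ref},
    receive
        {Ref, N} -> N
    after 1000 ->
        exit(timeout)
    end.

%% Ten workers each send 100 increments; no locks are involved.
run() ->
    Counter = start(),
    Parent = self(),
    Workers = [spawn(fun() ->
                   [increment(Counter) || _ <- lists:seq(1, 100)],
                   %% Round trip: once this returns, all of this worker's
                   %% increments have been processed by the counter.
                   _ = value(Counter),
                   Parent ! {done, self()}
               end) || _ <- lists:seq(1, 10)],
    [receive {done, W} -> ok end || W <- Workers],
    Total = value(Counter),
    Counter ! stop,
    Total.
```

Calling counter_demo:run() should return 1000, because every update is
serialised through the mailbox of the one process that owns the value.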
>
> Application 1: Prolog has an analogue of Erlang process
> dictionaries, but it is global. [More precisely, it is
> partitioned into named pieces each of which is local to
> a *module*, but the pieces are global to *threads.*]
> While SWI Prolog offers thread-local mutable data, the
> ISO draft includes no such thing. It's as if Erlang
> offered only global ETS tables accessed without locks.
> While mutexes (but oddly, not reader/writer locks) are
> present in the ISO draft, there is no *intrinsic*
> connection between any mutable table and any mutex.
This will lead to many horrific errors.
> Application 2: the draft introduces three kinds of IPC
> data: thread IDs, mutex IDs, and message queue IDs.
> There are three name-spaces for 'aliases', rather like
> the Erlang registry for process ids. These things are
> in effect mutable variables. There are operations to
> create and destroy threads, mutexes, and message queues.
> Although there are no operations for rebinding aliases,
> this can happen:
>     create a thingy and give it the alias 'fred'
>     create a thread that refers to 'fred'
>     destroy the thingy
>     create another thingy and give it the alias 'fred'
> So the other thread *thinks* it knows what 'fred' refers
> to, but it is wrong. As an example, there is a
> 'thread_join(Thread, Result)' operation which waits for
> the Thread to complete and then picks up its Result; if
> Thread is an alias, this could wait for the wrong thread.
Not good
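
The same hazard can be reproduced with the Erlang process registry, which
the quoted text compares the aliases to. The sketch below is only an
illustration under assumed names: the module alias_demo and the idle
processes are hypothetical, and unregistering the name stands in for
"destroy the thingy".

```erlang
-module(alias_demo).
-export([run/0]).

run() ->
    Old = spawn(fun idle/0),
    true = register(fred, Old),
    %% A client remembers only the alias, not the pid it stood for.
    Alias = fred,
    %% The name is released and then given to a different process.
    true = unregister(fred),
    New = spawn(fun idle/0),
    true = register(fred, New),
    %% Resolving the alias later silently yields the new process.
    Resolved = whereis(Alias),
    Result = {Resolved =:= Old, Resolved =:= New},
    %% Clean up so that run/0 can be called again.
    true = unregister(fred),
    Old ! stop,
    New ! stop,
    Result.

idle() ->
    receive stop -> ok end.
```

alias_demo:run() returns {false, true}: the client that held on to 'fred'
now reaches a process it never meant to talk to, whereas a client that had
held on to the pid Old would not have been redirected.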
>
> 5. No intrinsically unreliable information flows.
>
> There should be no query operations that give you
> information that you would have to be crazy to use.
> In particular, if you want some information about a
> thread, you should ASK it [so this may be a version
> of principle 3] and then you know that the information
> should be interpreted with reference to that specific
> synchronisation point.
yes
> Application: the ISO draft provides some operations
> of which it says "almost any usage of these ... is
> unsafe". These relate to finding the 'instantaneous'
> state of IPC objects. Because these are 'direct'
> queries that do not involve any explicit synchronisation,
> the point in the lifetime of the other thread that they
> refer to is entirely unknown. You cannot expect these
> values to apply "now" (whatever that means) and you
> cannot tell at _what_ point in the past of the other
> thread they do relate to.
Ugh - impossible - there is no "instantaneous" state of a remote object.
Light takes finite time to propagate through the ether. Think special
relativity - I guess the ISO standards committee members are not
ex-physicists :-)
>
> 6. No zombies.
>
> When a process dies, a death notice should be sent to
> its family and friends, if any, but the process itself
> should disappear completely.
yes
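
A death notice of this kind can be sketched with an Erlang link. This is a
minimal example under assumed names (the module notice_demo and the exit
reason finished are hypothetical): the parent traps exits, so the death of
the linked child arrives as an ordinary message and nothing about the child
has to be kept around afterwards.

```erlang
-module(notice_demo).
-export([run/0]).

run() ->
    %% Turn exit signals from linked processes into messages.
    Old = process_flag(trap_exit, true),
    Pid = spawn_link(fun() -> exit(finished) end),
    Result = receive
                 {'EXIT', Pid, Reason} -> {notified, Reason}
             after 1000 ->
                 no_notice
             end,
    %% Restore the caller's previous trap_exit setting.
    process_flag(trap_exit, Old),
    Result.
```

notice_demo:run() returns {notified, finished}; the notice is delivered
once to the linked process, and there is no join operation that would force
the runtime to retain the dead process's result.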
>
> Application: because the thread_join/2 operation
> merely _exists_ in the interface, at least the full
> exit or exception status of a process must be kept
> around as long as there is a live copy of its Pid
> anywhere in a process or the global data base, in
> case someone should wait for it. The term given to
> thread_exit/1 could be arbitrarily large. There is
> no way to promise that you won't use thread_join.
> In effect this is a mandatory space leak.
>
>
> While Erlang doesn't perfectly conform to these principles
> (the process registry being a particularly painful example),
> you can program *as if* it did. And if you think I would
> prefer multi-threading in Prolog to look as much as possible
> like Erlang, why yes, I would. I would like that very much.
>
> What got me looking at this was someone asking me to review
> a paper about how to implement thread_cancel/1, the operation
> that kills any thread. The paper claims that
>
> "The ability to cancel a thread is useful for
> application development and is critical to
> Prolog embeddability."
>
> and I found myself saying "but the ability to cancel a
> thread is like the ability to apply a chainsaw to your
> own neck! It's an incredibly easy way to violate
> system integrity."
>
> What's really frightening is that if I hadn't been exposed to
> Erlang, my previous exposure to Ada and Occam and Concurrent
> Pascal, nice though they are, might not have been enough to
> stop me reading the DTR and going "yeah, this looks like a
> fairly straightforward layer over pthreads, nice job" instead
> of "yuck". THANK YOU JOE!
Concurrency isn't a "nice layer over pthreads" - the most important thing
is isolation - anything that mucks up isolation is a mistake.
If my computer (the one I'm typing on NOW) crashes, I hope that
this will not crash your computer. So it should be with threads.
/Joe
> --
> If stupidity were a crime, who'd 'scape hanging?