[erlang-questions] Teaching Erlang as part of a paper -- advice sought

Tue Feb 9 09:36:35 CET 2010

Richard O'Keefe wrote:
> 
> On Feb 8, 2010, at 10:33 PM, Ulf Wiger wrote:
>> Kostis has alreay mentioned that data races are possible in Erlang.
> 
> I am dusting off my old 'pid_name' proposal from the 90s and turning
> it into an EEP.  I'd like to avoid races in the core language.

Some of the issues in the EEP were also dealt with in my gproc
paper (http://portal.acm.org/citation.cfm?id=1292522, also
http://svn.ulf.wiger.net/gproc/trunk/gproc/doc/erlang07-wiger.pdf).

I solved the race problem on registration by only allowing a
process to register itself. There is still a race in the sense
that two processes may contend for the same name, or that a
registered process may die before a "registered send" is delivered,
but this is, I think, an unavoidable part of the semantics of
referring to processes by name, combined with dynamic typing
and asynchronous message passing.

>> One possible benefit of using your approach is that it can highlight
>> that race conditions come in different forms. Some are part of the
>> problem domain (e.g. handshakes in peer-to-peer networks, credit
>> card transaction processing, session management in messaging systems
>> in general, etc.) while others are accidental - a consequence of the
>> chosen implementation technique. I would think that reasoning about
>> the high-level races provides an ample foundation for talking about
>> low-level data races.
> 
> I was proposing to have a couple of lectures about modelling systems
> in some sort of process calculus.  Because I know NuSMV, that's what
> I was planning to use.  I've always been very impressed by the work
> Thomas Arts and his colleagues did, but the latest version of EVT I
> can find is 2001, and since then, not only have Linux and Solaris
> and Java changed, so has Erlang changed.

On the topic of the process registry, there is also some recent
QuickCheck work that might be of interest.

Claessen, K., Palka, M., Smallbone, N., Hughes, J., Svensson, H., Arts, 
T., and Wiger, U. 2009. Finding race conditions in Erlang with 
QuickCheck and PULSE. SIGPLAN Not. 44, 9 (Aug. 2009), 149-160. DOI= 
http://doi.acm.org/10.1145/1631687.1596574

> Ulf, can you recommend anything I should look at to understand the
> issues in "(e.g. handshakes in peer-to-peer networks, credit
>> card transaction processing, session management in messaging systems
>> in general, etc.)"?

For peer-to-peer networks, the example that comes to mind
is DIAMETER (RFC 3588), but there may be other examples that
are less complex. One of the characteristics of peer-to-peer
is that you cannot know which side will initiate a connection
request - in this regard, the traditional telephony system is
obviously also a peer-to-peer network. Both sides must
therefore be prepared to be both "Initiator" and "Responder"
in a session setup (REF 3588, Ch 5.6).

The problem is somewhat compounded by the fact that the
Connection Establishment Request (CER) has to be preceded
by a TCP or SCTP connect to the peer's listen socket. This
already establishes a low-level session between the two
sides, and if both sides simultaneously connect, you will
have one TCP/SCTP session too many. This must be resolved.

On session management in messaging systems, this is related
to the above, but I was thinking specifically of the
handling of message sends in e.g. Jabber systems. There is
another QuickCheck paper - not yet published, I believe -
which deals with the problem that a message to a user should
be delivered immediately if the recipient is on-line, but
should be stored for later delivery otherwise. The problem
lies in determining when it is correct to deliver or store
respectively, considering that both the message delivery
and logout actions are asynchronous and take significant
time (as in non-zero amount of time).

In more traditional settings, the POTS (Plain Old Telephony
System) case includes the obviuos race condition where you
place a call to someone who is busy talking or placing a
call (perhaps even to you). If so, you should get a busy
signal. As the two end users are not synchronized in time
(except for the trivial case where you call yourseelf),
this amounts to a race condition, albeit one that is part
of the specification, and with a well-defined semantics.

The more subtle and somewhat eerie manifestation of this
is when you pick up the phone to call someone, and that
someone is already at the end of the line - but you didn't
hear it ring! The phone system vendors have had a long time
to polish the edges of their specifications...
Essentially, the system cancels out the transition states
and "fast-forwards" to the most practically useful stable
state.

The issue of credit card transactions is sort of a grey
area, as the 'specification' reflects the limitations
of current technology. If your transaction is on-line,
it is possible for the credit card company to perform
a real-time check of your balance. But if you are
off-line (or abroad, in which case the same tends to
apply), you are allowed to simply sign a slip. After
this, if you are disciplined enough to keep good
track of your spending, the balance in your head
differs from the balance in the credit card company
computers*, and subsequent real-time checks cannot
guarantee that you aren't already in overdraft.

* ...and if not, it's likely to differ more often than not.

One thing that makes Erlang a bit special compared to
most other languages is that it came about as the
result of extensive programming experiments on a
system where these types of races (and their big brother
"feature interaction") are commonplace, and inescapable.
So Erlang had to have a strategy for coping with the
event-state complexity explosion that is inevitable
if you pick the wrong approach. There are papers at
Ericsson (now surely only available as hard copy)
from the late 70s written by Mike Williams, Goran
Bage, Bjarne Dacker et al, outlining this.

>> BTW, which mutual exclusion techniques
>> can not be illustrated in Erlang?
> 
> None that I know of.  At least, no _high level_ ones.  If you want
> to demonstrate how, e.g., spin-locks _work_ at a low level, that's
> another matter.

AFAIR, there is a theorem similar to that of Turing
completeness that says that any complete paradigm for
mututal exclusion is sufficient to model any other complete
paradigm, but I've lost the reference.

But granted, it is more illustrative to experiment with
spin locks in C. But this can be covered in one lecture
and a subsequent lab. I recall that experience from a
course in embedded systems programming.

BR,
Ulf W
-- 
Ulf Wiger
CTO, Erlang Solutions Ltd, formerly Erlang Training & Consulting Ltd
http://www.erlang-solutions.com
---------------------------------------------------

---------------------------------------------------

WE'VE CHANGED NAMES!

Since January 1st 2010 Erlang Training and Consulting Ltd. has become ERLANG SOLUTIONS LTD.

www.erlang-solutions.com