[erlang-questions] What I dislike about Erlang
Richard O'Keefe
ok@REDACTED
Fri Aug 31 08:20:22 CEST 2012
We've just had a thread about what people like about Erlang.
We also had the announcement of TinyMQ.
So I'm going to use this as an example of what's *really*
wrong with Erlang.
Don't get me wrong. I endorse everything everyone else has
said in favour of Erlang. Erlang is like democracy: the worst
thing in its class except for all the others, and something
that is increasingly imitated by people who just don't get
some of the fundamental things about it.
I also endorse what people have said in praise of TinyMQ.
There are lots of things that it does right:
- there is a README
- there are EDoc comments with @specs for the public
interface
- the functions and variables are named well enough that
I was never in doubt about what any part of the code was
up to, at least not for longer than a second or two
- the hard work of process management is delegated to OTP
behaviours
At this point, it's looking better than anything I've written.
Make no mistake: I am not saying that Erlang or TinyMQ are *bad*.
They are good things; I'm just ranting somewhat vaguely about
why they should be better.
LUMPS OF INDISTINGUISHABLE CODE.
Up to a certain level of hand-waving, TinyMQ can be roughly
understood thus:
The TinyMQ *system* is a monitor
guarding a dictionary mapping strings to channels,
where
a channel is a monitor
guarding a bag of subscribers and
a sliding window of {Message, Timestamp} pairs.
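Written down as Erlang type declarations, that shape might look
roughly like the sketch below. Every type name here is hypothetical,
and it assumes that each "monitor" is realised as a process, so the
dictionary holds pids rather than the channel data itself.

```erlang
-module(tinymq_shape_sketch).
-export_type([system_state/0, channel_state/0]).

%% Hypothetical names; only the overall shape comes from the
%% description above, not from the TinyMQ source.
-type timestamp()  :: integer().
-type message()    :: term().
-type subscriber() :: pid().

%% What a channel process guards: a bag of subscribers and a
%% sliding window of {Message, Timestamp} pairs.
-type channel_state() :: {Subscribers :: [subscriber()],
                          Window :: [{message(), timestamp()}]}.

%% A channel is reached through the process that guards it
%% (assumption: one process per channel).
-type channel() :: pid().

%% What the system process guards: channel name -> channel.
-type system_state() :: dict:dict(string(), channel()).
```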
YOU CANNOT SEE THIS AT A GLANCE.
This is not Evan Miller's fault. *Anything* you write in
Erlang is going to end up as lumps of indistinguishable code,
because there is nothing else for it to be.
This is also true in C, C++, Java, C#, JavaScript, Go,
Eiffel, Smalltalk, Prolog, Haskell, Clean, SML, ...,
not to mention Visual Basic and Fortran.
Almost the only languages I know where it doesn't *have* to
be true are Lisp, Scheme, and Lisp-Flavoured Erlang. Arguably
Prolog *could* be in this group, but in practice it usually is
in the other camp. Thanks to the preprocessor, C *can* be
made rather more scrutable, but for some reason this is frowned on.
There's the e2 project (http://e2project.org) which is a step
in a good direction, but it doesn't do much about this problem.
A version of TinyMQ using e2_service instead of gen_server
would in fact exacerbate the problem by mushing
handle_call/3, handle_cast/2, and handle_info/2 into one
function, turning three lumps into one bigger lump.
LUMPS OF DATA.
Take tinymq_channel_controller as an example.
Using an OTP behaviour means that all six dimensions of the state
are mushed together in one data structure. This goes a long way
towards hiding the fact that
- supervisor, channel, and max_age are never changed
- messages, subscribers, and last_pull *are* changed.
One teeny tiny step here would be to offer an alternative set of
callbacks for some behaviours where the "state" is separated into
immutable "context" and mutable "state", so that it is obvious
*by construction* that the context information *can't* be changed.
Another option would be to have some way of annotating in a
-record declaration that a field cannot be updated.
I prefer the segregation approach on the grounds of no language
change being needed and the improved efficiency of not copying
fields that can't have changed. Others might prefer the revised
-record approach on the grounds of not having to change or
duplicate the OTP behaviours.
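To make the segregation idea concrete, here is a minimal sketch of a
hand-rolled server loop, not an OTP behaviour. The module name, the
callback signatures, and the two records are all assumptions made for
illustration. The point is that handle_call/3 receives the context
but has no way to return a new one.

```erlang
-module(ctx_sketch).
-export([start/2, call/2]).          % hypothetical mini-behaviour
-export([init/1, handle_call/3]).    % example callbacks, same module for brevity

-record(context, {channel, max_age}).   % fixed once init/1 returns
-record(state,   {messages = []}).      % replaced on every request

%% Start a server; Mod:init/1 must return {ok, Context, State}.
start(Mod, Args) ->
    spawn(fun() ->
              {ok, Context, State} = Mod:init(Args),
              loop(Mod, Context, State)
          end).

%% Synchronous request; exits if the server dies before replying.
call(Pid, Req) ->
    Ref = erlang:monitor(process, Pid),
    Pid ! {call, self(), Ref, Req},
    receive
        {Ref, Reply} ->
            erlang:demonitor(Ref, [flush]),
            Reply;
        {'DOWN', Ref, process, Pid, Reason} ->
            exit(Reason)
    end.

%% Context is passed down unchanged; only State is threaded through.
loop(Mod, Context, State) ->
    receive
        {call, From, Ref, Req} ->
            {reply, Reply, NewState} = Mod:handle_call(Req, Context, State),
            From ! {Ref, Reply},
            loop(Mod, Context, NewState)
    end.

init({Channel, MaxAge}) ->
    {ok, #context{channel = Channel, max_age = MaxAge}, #state{}}.

handle_call({push, Msg}, _Context, S = #state{messages = Ms}) ->
    {reply, ok, S#state{messages = [Msg | Ms]}};
handle_call(max_age, #context{max_age = MaxAge}, S) ->
    {reply, MaxAge, S};
handle_call(messages, _Context, S = #state{messages = Ms}) ->
    {reply, lists:reverse(Ms), S}.
```

A reader of such a callback module can tell which fields are fixed
just from which record they live in.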
I had to read each file in detail
- to find that certain fields *happened* not to be changed
- to understand the design well enough to tell that this was
almost certainly deliberate.
WE DOCUMENT THE WRONG THINGS.
It's well known that there are two kinds of documentation,
"external" documentation for people writing clients of a module,
and "internal" documentation for people maintaining the module
itself. It's also well known that the division is simplistic;
if the external documentation is silent about material points
you have to read the internal documentation.
In languages like Prolog and Erlang and Scheme where you build
data structures out of existing "universal" types and have no
data structure declarations, we tend to document procedures
but not data. This is backwards. If you understand the data,
and especially its invariants, the code is often pretty obvious.
There are two examples of this in TinyMQ. One is specific to
TinyMQ. The other is nearly universal in Erlang practice.
Erlang systems are made of lots of processes sending messages
to each other. Joe Armstrong has often said THINK ABOUT THE
PROTOCOLS. But Erlang programmers very seldom *write* about
the protocols.
Using the OTP behaviours, a "concurrent object" is implemented
as a module with a bunch of interface functions that forward
messages through the OTP layer to the callback code managed by
whatever behaviour it is. This protocol is unique to each kind
of concurrent object. It's often generated in one module (the
one with the interface functions) and consumed in another (the
one with the callback code), as it is in TinyMQ. And it's not
documented.
It is possible to reconstruct this protocol by reading the code
in detail and noting down what you see. It is troublesome when,
as in TinyMQ, the two modules disagree about the protocol. It's
clear that _something_ is wrong, but what, exactly?
For example, tinymq_controller has a case
handle_cast({set_max_age, NewMaxAge}, State) ->
but this is the only occurrence of set_max_age anywhere in TinyMQ.
Is its presence in tinymq_controller an example of dead code,
or is its absence from the rest of the application an example
of missing code? The same question can be asked about 'expire'
(which would forget a channel without making it actually go away,
if it could ever be invoked, which it can't.)
Almost as soon as I started reading Erlang code many years ago
it seemed obvious to me that documenting (and if possible, type
checking) these internal protocols was a very important part of
Erlang internal documentation. There must be something wrong
with my brain, because other people don't seem to feel this lack
anywhere near as strongly as I do. I think Joe Armstrong sort
of sees this at the next level up or he would never have invented
UBF.
But Occam, Go, and Sing# have typed channels, so they *are*
addressing the issue, and *do* have a natural central point to
document what the alternatives of an internal protocol signify.
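Without typed channels, the nearest Erlang approximation I can think
of is a single -type that lists the alternatives, with a comment on
each, kept where both the interface module and the callback module
can see it. The sketch below is hypothetical: the tags subscribe,
push, and poll are borrowed from the operations named in this post,
but the argument shapes are invented and are not TinyMQ's actual
messages.

```erlang
-module(channel_protocol_sketch).
-export([is_request/1]).
-export_type([request/0]).

%% Hypothetical internal protocol of a channel process.
%% The argument shapes are assumptions made for illustration.
-type request() ::
        {subscribe, Since :: integer(), Subscriber :: pid()}
        %% register Subscriber for messages newer than Since
      | {push, Message :: term()}
        %% append Message to the channel's window
      | {poll, Since :: integer()}.
        %% ask for the messages newer than Since

%% One executable statement of the protocol that both the sending
%% and the receiving module can be checked against.
-spec is_request(term()) -> boolean().
is_request({subscribe, Since, Pid}) when is_integer(Since), is_pid(Pid) -> true;
is_request({push, _Message}) -> true;
is_request({poll, Since}) when is_integer(Since) -> true;
is_request(_Other) -> false.
```

With something like this in place, a tag that is handled but never
sent (or sent but never handled) at least has one obvious place where
its absence or presence would be noticed.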
Another documentation failure is that we fail to document what
is not there. In TinyMQ, a channel automatically comes into
existence when you try to use it. Perhaps as a consequence of
this, there is no way to shut a channel down. In TinyMQ, old
messages are not removed from a channel when they expire, but
the next time someone does a 'subscribe' (waves hands) or a 'poll'
or a 'push' *after* they expire. So if processes stop sending
messages to and requesting messages from some channel, the last few messages,
no matter how large, may hang around forever. I'm sure there
is a reason, but because it's a reason for something *not* being
there, there's no obvious place to hang the comment, and there
isn't one. (Except for the dead 'expire' clause mentioned above.)
IT'S HARD TO SPOT SALIENT DETAIL IN A SEA OF GLUE CODE.
The central fact about TinyMQ is that it holds the messages of
a channel in a simple list of {Message, Timestamp} pairs. As
a result, every operation on the data takes time linear in the
current size.
This is not stated anywhere in any comments nor in the README.
You have to read the code in detail to discover this. And it
is a rather nasty surprise. If a channel holds N messages,
the operations *can* be done in O(log(N)) time. (I believe it
is possible to do even better.) Some sliding window applications
have a bound on the number of elements in the window. This one
has a bound on the age of elements, but they could arrive at a
very high rate, so N *could* get large.
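As a rough illustration of the O(log(N)) claim, here is a minimal
sketch of an age-bounded window built on gb_trees. It is not TinyMQ
code and not an existing library; the module, record, and function
names are made up, and it assumes integer timestamps. Keys are
{Timestamp, Seq} pairs so that messages with equal timestamps stay
distinct.

```erlang
-module(window_sketch).
-export([new/1, push/3, since/2, expire/2]).

%% Hypothetical sketch; none of these names come from TinyMQ or OTP.
-record(window, {max_age, seq = 0, tree = gb_trees:empty()}).

new(MaxAge) -> #window{max_age = MaxAge}.

%% Insert a message stamped Now: O(log N).
push(Msg, Now, W = #window{seq = S, tree = T}) ->
    W#window{seq = S + 1, tree = gb_trees:insert({Now, S}, Msg, T)}.

%% Messages with timestamp > Since, oldest first:
%% O(log N) to find the start, then one step per result.
since(Since, #window{tree = T}) ->
    collect(gb_trees:next(gb_trees:iterator_from({Since + 1, 0}, T))).

collect(none) -> [];
collect({{Ts, _Seq}, Msg, Iter}) ->
    [{Msg, Ts} | collect(gb_trees:next(Iter))].

%% Drop messages older than Now - MaxAge: O(log N) per message dropped.
expire(Now, W = #window{max_age = MaxAge, tree = T}) ->
    W#window{tree = drop_old(Now - MaxAge, T)}.

drop_old(Cutoff, T) ->
    case gb_trees:is_empty(T) of
        true -> T;
        false ->
            case gb_trees:smallest(T) of
                {{Ts, _Seq}, _Msg} when Ts < Cutoff ->
                    {_Key, _Val, T1} = gb_trees:take_smallest(T),
                    drop_old(Cutoff, T1);
                _ -> T
            end
    end.
```

None of these operations walks the whole window unless the whole
window is the answer.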
It is very easy to implement the necessary operations using lists,
so much so that they are present in several copies. Revising the
TinyMQ implementation to work better with long queues would be
harder than necessary because of this. And this goes unnoticed
because there is so much glue code for the guts to get lost in.
Given that Evan Miller took the trouble to use library components
for structuring this application, why didn't he take the next step,
and use the existing 'sliding window' library data structure?
Because there is none!
Yet sliding windows of one sort or another have come up before in
this mailing list. Perhaps we should have a Wiki page on
trapexit to gather requirements for one or more sliding window
libraries. Or perhaps not. "true religion jeans for women" --
what has that or "Cheap Nike Shoes" to do with Erlang/OTP
(http://www.trapexit.org/forum/viewforum.php?f=20)?