[erlang-questions] What I dislike about Erlang

Fri Aug 31 08:20:22 CEST 2012

We've just had a thread about what people like about Erlang.
We also had the announcement of TinyMQ.
So I'm going to use this as an example of what's *really*
wrong with Erlang.

Don't get me wrong.  I endorse everything everyone else has
said in favour of Erlang.  Erlang is like democracy: the worst
thing in its class except for all the others, and something
that is increasingly imitated by people who just don't get
some of the fundamental things about it.

I also endorse what people have said in praise of TinyMQ.
There are lots of things that it does right:
 - there is a README
 - there are EDoc comments with @specs for the public
   interface
 - the functions and variables are named well enough that
   I was never in doubt about what any part of the code was
   up to, at least not for longer than a second or two
 - the hard work of process management is delegated to OTP
   behaviours
At this point, it's looking better than anything I've written.

Make no mistake: I am not saying that Erlang or TinyMQ are *bad*.
They are good things; I'm just ranting somewhat vaguely about
why they should be better.

LUMPS OF INDISTINGUISHABLE CODE.

  Up to a certain level of hand-waving, TinyMQ can be roughly
  understood thus:
	The TinyMQ *system* is a monitor
	guarding a dictionary mapping strings to channels,
  where
	a channel is a monitor
	guarding a bag of subscribers and
	a sliding window of {Message, Timestamp} pairs.

  YOU CANNOT SEE THIS AT A GLANCE.

  This is not Evan Miller's fault.  *Anything* you write in
  Erlang is going to end up as lumps of indistinguishable code,
  because there is nothing else for it to be.

  This is also true in C, C++, Java, C#, Javascript, Go,
  Eiffel, Smalltalk, Prolog, Haskell, Clean, SML, ...,
  not to mention Visual Basic and Fortran.

  Almost the only languages I know where it doesn't *have* to
  be true are Lisp, Scheme, and Lisp-Flavoured Erlang.  Arguably
  Prolog *could* be in this group, but in practice it usually is
  in the other camp.  Thanks to the preprocessor, C *can* be
  made rather more scrutable, but for some reason this is frowned on.

  There's the e2 project (http://e2project.org) which is a step
  in a good direction, but it doesn't do much about this problem.
  A version of TinyMQ using e2_service instead of gen_server
  would in fact exacerbate the problem by mushing
  handle_call/3, handle_cast/2, and handle_info/2 into one
  function, turning three lumps into one bigger lump.

LUMPS OF DATA.

  Take tinymq_channel_controller as an example.
  Using an OTP behaviour means that all six dimensions of the state
  are mushed together in one data structure.  This goes a long way
  towards hiding the fact that

	supervisor, channel, and max_age are never changed
	messages, subscribers, and last_pull *are* changed.

  One teeny tiny step here would be to offer an alternative set of
  callbacks for some behaviours where the "state" is separated into
  immutable "context" and mutable "state", so that it is obvious
  *by construction* that the context information *can't* be changed.

  Another option would be to have some way of annotating, in a
  -record declaration, that a field cannot be updated.

  I prefer the segregation approach on the grounds of no language
  change being needed and the improved efficiency of not copying
  fields that can't have changed.  Others might prefer the revised
  -record approach on the grounds of not having to change or
  duplicate the OTP behaviours.
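
  A minimal sketch of the segregation approach, assuming a
  hypothetical behaviour whose callbacks take the context and the
  state as separate arguments.  No such OTP behaviour exists; the
  callback shapes and the request terms ({push, Message, Now} and
  max_age) are invented for illustration.  Only the six field names
  come from tinymq_channel_controller as described above.

```erlang
-module(ctx_state_sketch).
-export([init/1, handle_call/4]).

%% Immutable part: built once in init/1.  Callbacks receive it but
%% have no way to return a replacement.
-record(context, {supervisor, channel, max_age}).

%% Mutable part: threaded through the callbacks and replaced on update.
-record(state, {messages = [], subscribers = [], last_pull}).

%% Hypothetical callback: returns the context and the initial state.
init([Supervisor, Channel, MaxAge]) ->
    {ok,
     #context{supervisor = Supervisor, channel = Channel, max_age = MaxAge},
     #state{}}.

%% Hypothetical callback shape: handle_call(Request, From, Context, State).
%% The result carries only a new State, so the context cannot change
%% by construction.
handle_call({push, Message, Now}, _From, #context{max_age = MaxAge}, State) ->
    %% Keep only the pairs that are still within MaxAge of Now.
    Fresh = [P || {_, T} = P <- State#state.messages, Now - T =< MaxAge],
    {reply, ok, State#state{messages = [{Message, Now} | Fresh]}};
handle_call(max_age, _From, #context{max_age = MaxAge}, State) ->
    {reply, MaxAge, State}.
```

  A reader of such a module can tell from the record declarations
  alone which fields are fixed and which ones move.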

  I had to read each file in detail
  - to find that certain fields *happened* not to be changed
  - to understand the design well enough to tell that this was
    almost certainly deliberate.

WE DOCUMENT THE WRONG THINGS.

  It's well known that there are two kinds of documentation,
  "external" documentation for people writing clients of a module,
  and "internal" documentation for people maintaining the module
  itself.  It's also well known that the division is simplistic;
  if the external documentation is silent about material points
  you have to read the internal documentation.

  In languages like Prolog and Erlang and Scheme where you build
  data structures out of existing "universal" types and have no
  data structure declarations, we tend to document procedures
  but not data.  This is backwards.  If you understand the data,
  and especially its invariants, the code is often pretty obvious.
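
  As an illustration of documenting the data rather than the
  procedures, here is a sketch of how a window of
  {Message, Timestamp} pairs might be declared together with its
  invariant.  The module name, the type names, and the newest-first
  ordering are assumptions made for this example; they are not taken
  from the TinyMQ sources.

```erlang
-module(window_doc).
-export([is_valid/1]).
-export_type([window/0]).

-type message()   :: term().
-type timestamp() :: integer().

%% A window is a list of {Message, Timestamp} pairs.
%% INVARIANT (assumed for this sketch): timestamps never increase
%% from head to tail, i.e. the newest pair comes first.
-type window() :: [{message(), timestamp()}].

%% Executable form of the invariant, usable in tests and assertions.
-spec is_valid(term()) -> boolean().
is_valid([{_, T1} | Rest = [{_, T2} | _]]) ->
    T1 >= T2 andalso is_valid(Rest);
is_valid([{_, _}]) -> true;
is_valid([])       -> true;
is_valid(_)        -> false.
```

  With the invariant written down, a function that prepends a new
  pair or trims old ones from the tail needs very little
  explanation of its own.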

  There are two examples of this in TinyMQ.  One is specific to
  TinyMQ.  The other is nearly universal in Erlang practice.

  Erlang systems are made of lots of processes sending messages
  to each other.  Joe Armstrong has often said THINK ABOUT THE
  PROTOCOLS.  But Erlang programmers very seldom *write* about
  the protocols.

  Using the OTP behaviours, a "concurrent object" is implemented
  as a module with a bunch of interface functions that forward
  messages through the OTP layer to the callback code managed by
  whatever behaviour it is.  This protocol is unique to each kind
  of concurrent object.  It's often generated in one module (the
  one with the interface functions) and consumed in another (the
  one with the callback code), as it is in TinyMQ.  And it's not
  documented.

  It is possible to reconstruct this protocol by reading the code
  in detail and noting down what you see.  It is troublesome when,
  as in TinyMQ, the two modules disagree about the protocol.  It's
  clear that _something_ is wrong, but what, exactly?

  For example, tinymq_controller has a case
	handle_cast({set_max_age, NewMaxAge}, State) ->
  but this is the only occurrence of set_max_age anywhere in TinyMQ.
  Is its presence in tinymq_controller an example of dead code,
  or is its absence from the rest of the application an example
  of missing code?  The same question can be asked about 'expire'
  (which would forget a channel without making it actually go away,
   if it could ever be invoked, which it can't.)

  Almost as soon as I started reading Erlang code many years ago
  it seemed obvious to me that documenting (and if possible, type
  checking) these internal protocols was a very important part of
  Erlang internal documentation.  There must be something wrong
  with my brain, because other people don't seem to feel this lack
  anywhere near as strongly as I do.  I think Joe Armstrong sort
  of sees this at the next level up or he would never have invented
  UBF.

  But Occam, Go, and Sing# have typed channels, so they *are*
  addressing the issue, and *do* have a natural central point to
  document what the alternatives of an internal protocol signify.
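
  Lacking typed channels, one way to give an internal protocol a
  central home in Erlang is a small module that owns the message
  type.  The sketch below is hypothetical: the operation names
  subscribe, push, and poll are mentioned in this post, but the
  exact tuple shapes and the module itself are assumptions, not
  TinyMQ's actual protocol.

```erlang
-module(chan_protocol).
-export([subscribe_msg/1, push_msg/1, poll_msg/1, is_request/1]).
-export_type([request/0]).

-type timestamp() :: integer().

%% Every message the interface module may send to a channel process.
%% Each alternative is documented once, here, and nowhere else.
-type request() ::
        {subscribe, pid()}       % register a subscriber
      | {push, term()}           % append a message to the channel
      | {poll, timestamp()}.     % ask for messages newer than a time

%% Constructors used by the interface module.
-spec subscribe_msg(pid()) -> request().
subscribe_msg(Pid) when is_pid(Pid) -> {subscribe, Pid}.

-spec push_msg(term()) -> request().
push_msg(Message) -> {push, Message}.

-spec poll_msg(timestamp()) -> request().
poll_msg(Since) when is_integer(Since) -> {poll, Since}.

%% Check used by the callback module: anything that is not part of
%% the documented protocol is rejected instead of silently handled.
-spec is_request(term()) -> boolean().
is_request({subscribe, Pid}) when is_pid(Pid)  -> true;
is_request({push, _})                          -> true;
is_request({poll, T}) when is_integer(T)       -> true;
is_request(_)                                  -> false.
```

  If both the interface module and the callback module go through
  such a module, a clause for a message that nobody can send shows
  up as a term that is_request/1 rejects.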

  Another documentation failure is that we fail to document what
  is not there.  In TinyMQ, a channel automatically comes into
  existence when you try to use it.  Perhaps as a consequence of
  this, there is no way to shut a channel down.  In TinyMQ, old
  messages are not removed from a channel when they expire, but
  the next time someone does a 'subscribe' (waves hands) or a 'poll'
  or a 'push' *after* they expire.  So if processes stop sending
  messages to, and requesting messages from, some channel, the last
  few messages,
  no matter how large, may hang around forever.  I'm sure there
  is a reason, but because it's a reason for something *not* being
  there, there's no obvious place to hang the comment, and there
  isn't one.  (Except for the dead 'expire' clause mentioned above.)

IT'S HARD TO SPOT SALIENT DETAIL IN A SEA OF GLUE CODE.

  The central fact about TinyMQ is that it holds the messages of
  a channel in a simple list of {Message, Timestamp} pairs.  As
  a result, every operation on the data takes time linear in the
  current size.

  This is not stated anywhere in any comments nor in the README.
  You have to read the code in detail to discover this.  And it
  is a rather nasty surprise.  If a channel holds N messages,
  the operations *can* be done in O(log(N)) time.  (I believe it
  is possible to do even better.)  Some sliding window applications
  have a bound on the number of elements in the window.  This one
  has a bound on the age of elements, but they could arrive at a
  very high rate, so N *could* get large.
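
  A minimal sketch of an age-bounded window that avoids the linear
  scan for insertion and expiry, built on the standard queue
  module.  It is not the TinyMQ implementation and not an existing
  library; the module name and function names are made up, and it
  assumes timestamps are pushed in non-decreasing order.  It covers
  only push and expire; fetching the messages newer than a given
  timestamp would still walk the queue.

```erlang
-module(sliding_window).
-export([new/1, push/3, expire/2, to_list/1]).

%% max_age: how long a pair may stay in the window.
%% q: queue of {Message, Timestamp} pairs, oldest at the front.
-record(win, {max_age, q}).

new(MaxAge) when is_integer(MaxAge), MaxAge >= 0 ->
    #win{max_age = MaxAge, q = queue:new()}.

%% Drop whatever has expired at time Now, then add Message at the
%% rear.  Each pair is inserted once and removed at most once, so
%% the cost is amortized constant per message.
push(Message, Now, W0) ->
    W = expire(Now, W0),
    W#win{q = queue:in({Message, Now}, W#win.q)}.

%% Remove pairs older than max_age from the front of the queue.
expire(Now, #win{max_age = MaxAge, q = Q} = W) ->
    case queue:peek(Q) of
        {value, {_, T}} when Now - T > MaxAge ->
            expire(Now, W#win{q = queue:drop(Q)});
        _ ->
            W
    end.

%% Contents of the window, oldest first.
to_list(#win{q = Q}) ->
    queue:to_list(Q).
```

  Because the representation is hidden behind four functions, it
  could later be swapped for a balanced tree keyed on timestamp
  without touching the callers.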

  It is very easy to implement the necessary operations using lists,
  so much so that they are present in several copies.  Revising the
  TinyMQ implementation to work better with long queues would be
  harder than necessary because of this.  And this goes unnoticed
  because there is so much glue code for the guts to get lost in.

  Given that Evan Miller took the trouble to use library components
  for structuring this application, why didn't he take the next step,
  and use the existing 'sliding window' library data structure?

	Because there is none!

  Yet sliding windows of one sort or another have come up before in
  this mailing list.  Perhaps we should have a Wiki page on
  trapexit to gather requirements for one or more sliding window
  libraries.  Or perhaps not.  "true religion jeans for women" --
  what has that or "Cheap Nike Shoes" to do with Erlang/OTP
  (http://www.trapexit.org/forum/viewforum.php?f=20)?