[erlang-questions] What I dislike about Erlang

Fri Aug 31 19:42:17 CEST 2012

Richard,

Thanks for your comments. To preface, I plead guilty to charges of
gross negligence in failing to document TinyMQ's internals. This was
laziness on my part.

I released TinyMQ only because I felt guilty for sitting on the code
for about a year. Like many open-source programmers, I have a lot of
demands on my attention, and it is not clear in advance what
documentation is actually worth writing. The @spec and @doc strings
for the public API seemed like a good start. But if it turned out that
no one was interested in using the library in the first place, why
should I bother documenting internal protocols and data structures?
I've wasted many hours in the past documenting, refactoring, and
generally cleaning up application internals for the benefit of
nebulous "others", only to receive zero patches and no indication that
any of my efforts were of any assistance to anyone.

So in the spirit of your capitalized complaints, I will just say:

ALL YOU HAVE TO DO IS ASK

Want to know about the big-O performance characteristics? Just ask.
Want to know how channel creation works? Just ask. As a lazy person,
if a few people ask me the same thing I'll usually add a note to the
README in order to avert future emails from strangers. We all like a
well-documented project, but without feedback and communication it is
not clear where one's efforts are best spent on a project that doesn't
have an explicit client. If I knew in advance who would be using and
reading the code (i.e. if I wrote this code for an employer), I would
put more effort into writing documents for that specific audience. But
as a rule, if I am just putting some code "out there", I would rather
wait and see what people would like to know about, rather than
pre-emptively document every thought that has ever occurred to me
relating to the code base.

Now, I know you were not trying to pick on TinyMQ, and your interest
is more in how Erlang tends to result in lumps of code that obscure
key characteristics of the application. I agree with the assessment,
but I am not quite as hopeless about the situation.

I would like to see the development of graphical tools that let you
see in an instant how applications are structured and how they behave.
I am thinking of something like Pman on steroids, where I can *watch*
messages travel between processes, *inspect* gen_server state, and
*test* the system by seeing the result of single function calls or
many (load-testing). I'd like to be able to do all this with my mouse,
and generally get the feeling that I am watching the operation of a
machine that *shows* me how messages are passed, processes are
created, and state is updated.

Did anyone else ever play Marble Drop from Maxis in the late 90s? That
is the kind of interface I would like to see for the Erlang run-time.

For now, I'll update the README.

Evan

On Fri, Aug 31, 2012 at 1:20 AM, Richard O'Keefe <ok@REDACTED> wrote:
> We've just had a thread about what people like about Erlang.
> We also had the announcement of TinyMQ.
> So I'm going to use this as an example of what's *really*
> wrong with Erlang.
>
> Don't get me wrong.  I endorse everything everyone else has
> said in favour of Erlang.  Erlang is like democracy: the worst
> thing in its class except for all the others, and something
> that is increasingly imitated by people who just don't get
> some of the fundamental things about it.
>
> I also endorse what people have said in praise of TinyMQ.
> There are lots of things that it does right:
>  - there is a README
>  - there are EDoc comments with @specs for the public
>    interface
>  - the functions and variables are named well enough that
>    I was never in doubt about what any part of the code was
>    up to, at least not for longer than a second or two
>  - the hard work of process management is delegated to OTP
>    behaviours
> At this point, it's looking better than anything I've written.
>
> Make no mistake: I am not saying that Erlang or TinyMQ are *bad*.
> They are good things; I'm just ranting somewhat vaguely about
> why they should be better.
>
>
> LUMPS OF INDISTINGUISHABLE CODE.
>
>   Up to a certain level of hand-waving, TinyMQ can be roughly
>   understood thus:
>         The TinyMQ *system* is a monitor
>         guarding a dictionary mapping strings to channels,
>   where
>         a channel is a monitor
>         guarding a bag of subscribers and
>         a sliding window of {Message, Timestamp} pairs.
>
>   YOU CANNOT SEE THIS AT A GLANCE.
>
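
One way to write the structure described above down in a single place is a
set of type declarations. This is only a sketch: the module name, the type
names, and the use of maps are assumptions made for illustration and are not
taken from TinyMQ's source.

```erlang
-module(tinymq_shape_sketch).
-export([new_system/0, new_channel/0]).
-export_type([system/0, channel/0, entry/0]).

%% One element of the sliding window.
-type entry() :: {Message :: term(), Timestamp :: integer()}.

%% State guarded by one channel process (hypothetical layout).
-type channel() :: #{subscribers := [pid()], window := [entry()]}.

%% State guarded by the system process: channel name -> channel process.
-type system() :: #{string() => pid()}.

-spec new_system() -> system().
new_system() -> #{}.

-spec new_channel() -> channel().
new_channel() -> #{subscribers => [], window => []}.
```

The declarations carry no behaviour; their only job is to make the shape of
the data visible before any glue code is read.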
>   This is not Evan Miller's fault.  *Anything* you write in
>   Erlang is going to end up as lumps of indistinguishable code,
>   because there is nothing else for it to be.
>
>   This is also true in C, C++, Java, C#, Javascript, Go,
>   Eiffel, Smalltalk, Prolog, Haskell, Clean, SML, ...,
>   not to mention Visual Basic and Fortran.
>
>   Almost the only languages I know where it doesn't *have* to
>   be true are Lisp, Scheme, and Lisp-Flavoured Erlang.  Arguably
>   Prolog *could* be in this group, but in practice it usually is
>   in the other camp.  Thanks to the preprocessor, C *can* be
>   made rather more scrutable, but for some reason this is frowned on.
>
>   There's the e2 project (http://e2project.org) which is a step
>   in a good direction, but it doesn't do much about this problem.
>   A version of TinyMQ using e2_service instead of gen_server
>   would in fact exacerbate the problem by mushing
>   handle_call/3, handle_cast/2, and handle_info/2 into one
>   function, turning three lumps into one bigger lump.
>
> LUMPS OF DATA.
>
>   Take tinymq_channel_controller as an example.
>   Using an OTP behaviour means that all six dimensions of the state
>   are mushed together in one data structure.  This goes a long way
>   towards hiding the fact that
>
>         supervisor, channel, and max_age are never changed
>         messages, subscribers, and last_pull *are* changed.
>
>   One teeny tiny step here would be to offer an alternative set of
>   callbacks for some behaviours where the "state" is separated into
>   immutable "context" and mutable "state", so that it is obvious
>   *by construction* that the context information *can't* be changed.
>
>   Another option would be to have some way of annotating in a
>   -record declaration that a field cannot be updated.
>
>   I prefer the segregation approach on the grounds of no language
>   change being needed and the improved efficiency of not copying
>   fields that can't have changed.  Others might prefer the revised
>   -record approach on the grounds of not having to change or
>   duplicate the OTP behaviours.
>
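
The segregation approach can be sketched with a hand-rolled server loop.
This is not an OTP behaviour and not TinyMQ's code: the module name, the
message shapes, and the millisecond unit for max_age are assumptions. Only
the six field names come from the discussion above.

```erlang
-module(ctx_state_sketch).
-export([start/3, push/2, count/1]).

%% Fixed at start-up and never rebuilt.
-record(context, {supervisor, channel, max_age}).
%% Replaced on every request that changes something.
-record(state, {messages = [], subscribers = [], last_pull}).

start(Supervisor, Channel, MaxAgeMs) ->
    Ctx = #context{supervisor = Supervisor, channel = Channel,
                   max_age = MaxAgeMs},
    spawn(fun() -> loop(Ctx, #state{}) end).

push(Pid, Message) ->
    Pid ! {push, Message},
    ok.

count(Pid) ->
    Pid ! {count, self()},
    receive
        {count, N} -> N
    after 1000 -> timeout
    end.

%% handle/3 returns only a new state, so the context cannot change
%% by construction: the loop always passes the same Ctx along.
loop(Ctx, State) ->
    receive
        Request -> loop(Ctx, handle(Request, Ctx, State))
    end.

handle({push, Message}, #context{max_age = MaxAge},
       State = #state{messages = Ms}) ->
    Now = erlang:monotonic_time(millisecond),
    Fresh = [{M, T} || {M, T} <- Ms, Now - T =< MaxAge],
    State#state{messages = [{Message, Now} | Fresh]};
handle({count, From}, _Ctx, State = #state{messages = Ms}) ->
    From ! {count, length(Ms)},
    State;
handle(_Unknown, _Ctx, State) ->
    State.
```

A reader can tell from the two record declarations alone which fields are
expected to change, without reading every clause.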
>   I had to read each file in detail
>   - to find that certain fields *happened* not to be changed
>   - to understand the design well enough to tell that this was
>     almost certainly deliberate.
>
> WE DOCUMENT THE WRONG THINGS.
>
>   It's well known that there are two kinds of documentation,
>   "external" documentation for people writing clients of a module,
>   and "internal" documentation for people maintaining the module
>   itself.  It's also well known that the division is simplistic;
>   if the external documentation is silent about material points
>   you have to read the internal documentation.
>
>   In languages like Prolog and Erlang and Scheme where you build
>   data structures out of existing "universal" types and have no
>   data structure declarations, we tend to document procedures
>   but not data.  This is backwards.  If you understand the data,
>   and especially its invariants, the code is often pretty obvious.
>
>   There are two examples of this in TinyMQ.  One is specific to
>   TinyMQ.  The other is nearly universal in Erlang practice.
>
>   Erlang systems are made of lots of processes sending messages
>   to each other.  Joe Armstrong has often said THINK ABOUT THE
>   PROTOCOLS.  But Erlang programmers very seldom *write* about
>   the protocols.
>
>   Using the OTP behaviours, a "concurrent object" is implemented
>   as a module with a bunch of interface functions that forward
>   messages through the OTP layer to the callback code managed by
>   whatever behaviour it is.  This protocol is unique to each kind
>   of concurrent object.  It's often generated in one module (the
>   one with the interface functions) and consumed in another (the
>   one with the callback code), as it is in TinyMQ.  And it's not
>   documented.
>
>   It is possible to reconstruct this protocol by reading the code
>   in detail and noting down what you see.  It is troublesome when,
>   as in TinyMQ, the two modules disagree about the protocol.  It's
>   clear that _something_ is wrong, but what, exactly?
>
>   For example, tinymq_controller has a case
>         handle_cast({set_max_age, NewMaxAge}, State) ->
>   but this is the only occurrence of set_max_age anywhere in TinyMQ.
>   Is its presence in tinymq_controller an example of dead code,
>   or is its absence from the rest of the application an example
>   of missing code?  The same question can be asked about 'expire'
>   (which would forget a channel without making it actually go away,
>    if it could ever be invoked, which it can't.)
>
>   Almost as soon as I started reading Erlang code many years ago
>   it seemed obvious to me that documenting (and if possible, type
>   checking) these internal protocols was a very important part of
>   Erlang internal documentation.  There must be something wrong
>   with my brain, because other people don't seem to feel this lack
>   anywhere near as strongly as I do.  I think Joe Armstrong sort
>   of sees this at the next level up or he would never have invented
>   UBF.
>
>   But Occam, Go, and Sing# have typed channels, so they *are*
>   addressing the issue, and *do* have a natural central point to
>   document what the alternatives of an internal protocol signify.
>
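
Erlang has no typed channels, but a shared -type declaration can serve as a
central point for documenting an internal protocol. The sketch below is an
assumption-laden illustration: the module name, the builder functions, the
map-based state, and the exact message shapes are hypothetical, and only the
names push and set_max_age come from the discussion above.

```erlang
-module(channel_protocol_sketch).
-export([push_msg/1, set_max_age_msg/1, handle_cast/2]).
-export_type([cast/0]).

%% Every message the interface side may cast to the callback side.
%% This declaration is the one place where the protocol is written down.
-type cast() :: {push, Message :: term()}
              | {set_max_age, Seconds :: non_neg_integer()}.

%% Interface side: the only functions that build protocol messages.
-spec push_msg(term()) -> cast().
push_msg(Message) -> {push, Message}.

-spec set_max_age_msg(non_neg_integer()) -> cast().
set_max_age_msg(Seconds) -> {set_max_age, Seconds}.

%% Callback side: one clause per alternative of cast().
-spec handle_cast(cast(), map()) -> {noreply, map()}.
handle_cast({push, Message}, State = #{messages := Ms}) ->
    {noreply, State#{messages := [Message | Ms]}};
handle_cast({set_max_age, NewMaxAge}, State) ->
    {noreply, State#{max_age => NewMaxAge}}.
```

With the builders and the handler both written against cast(), a message
that is handled but never built (or built but never handled) shows up as a
mismatch against one short declaration rather than across two modules.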
>   Another documentation failure is that we fail to document what
>   is not there.  In TinyMQ, a channel automatically comes into
>   existence when you try to use it.  Perhaps as a consequence of
>   this, there is no way to shut a channel down.  In TinyMQ, old
>   messages are not removed from a channel when they expire, but
>   the next time someone does a 'subscribe' (waves hands) or a 'poll'
>   or a 'push' *after* they expire.  So if processes stop sending
>   and requesting messages to some channel, the last few messages,
>   no matter how large, may hang around forever.  I'm sure there
>   is a reason, but because it's a reason for something *not* being
>   there, there's no obvious place to hang the comment, and there
>   isn't one.  (Except for the dead 'expire' clause mentioned above.)
>
> IT'S HARD TO SPOT SALIENT DETAIL IN A SEA OF GLUE CODE.
>
>   The central fact about TinyMQ is that it holds the messages of
>   a channel in a simple list of {Message, Timestamp} pairs.  As
>   a result, every operation on the data takes time linear in the
>   current size.
>
>   This is not stated anywhere in any comments nor in the README.
>   You have to read the code in detail to discover this.  And it
>   is a rather nasty surprise.  If a channel holds N messages,
>   the operations *can* be done in O(log(N)) time.  (I believe it
>   is possible to do even better.)  Some sliding window applications
>   have a bound on the number of elements in the window.  This one
>   has a bound on the age of elements, but they could arrive at a
>   very high rate, so N *could* get large.
>
>   It is very easy to implement the necessary operations using lists,
>   so much so that they are present in several copies.  Revising the
>   TinyMQ implementation to work better with long queues would be
>   harder than necessary because of this.  And this goes un-noticed
>   because there is so much glue code for the guts to get lost in.
>
>   Given that Evan Miller took the trouble to use library components
>   for structuring this application, why didn't he take the next step,
>   and use the existing 'sliding window' library data structure?
>
>         Because there is none!
>
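
As a starting point for such a library, here is a minimal sketch of an
age-bounded sliding window built on gb_trees from the standard library. The
module name and API are hypothetical, timestamps are assumed to be numbers
supplied by the caller, and gb_trees:iterator_from/2 requires OTP 18 or
later. It is not a drop-in replacement for TinyMQ's lists.

```erlang
-module(sliding_window_sketch).
-export([new/1, push/3, expire/2, since/2, len/1]).

%% next is a sequence number that keeps keys unique when two
%% messages carry the same timestamp.
-record(win, {max_age, next = 0, tree}).

new(MaxAge) ->
    #win{max_age = MaxAge, tree = gb_trees:empty()}.

%% Insert one message; O(log N).
push(Message, Timestamp, W = #win{next = N, tree = T}) ->
    W#win{next = N + 1,
          tree = gb_trees:insert({Timestamp, N}, Message, T)}.

%% Drop every message older than Now - MaxAge; O(log N) per dropped message.
expire(Now, W = #win{max_age = MaxAge, tree = T}) ->
    W#win{tree = drop_older(Now - MaxAge, T)}.

drop_older(Cutoff, T) ->
    case gb_trees:is_empty(T) of
        true -> T;
        false ->
            case gb_trees:smallest(T) of
                {{Ts, _Seq}, _Msg} when Ts < Cutoff ->
                    {_Key, _Val, T1} = gb_trees:take_smallest(T),
                    drop_older(Cutoff, T1);
                _ -> T
            end
    end.

%% Messages with Timestamp >= Since, oldest first. Sequence numbers
%% start at 0, so {Since, 0} sorts before every key with that timestamp.
since(Since, #win{tree = T}) ->
    collect(gb_trees:next(gb_trees:iterator_from({Since, 0}, T))).

collect(none) -> [];
collect({{Ts, _Seq}, Msg, Iter}) ->
    [{Msg, Ts} | collect(gb_trees:next(Iter))].

len(#win{tree = T}) -> gb_trees:size(T).
```

Keeping the operations behind one small module also means the representation
can be changed later without touching several copies of the same list code.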
>   Yet sliding windows of one sort or another have come up before in
>   this mailing list.  Perhaps we should have a Wiki page on
>   trapexit to gather requirements for one or more sliding window
>   libraries.  Or perhaps not.  "true religion jeans for women" --
>   what has that or "Cheap Nike Shoes" to do with Erlang/OTP
>   (http://www.trapexit.org/forum/viewforum.php?f=20)?
>

-- 
Evan Miller
http://www.evanmiller.org/