[erlang-questions] What I dislike about Erlang

Fri Aug 31 09:43:41 CEST 2012

My biggest gripe with Erlang is the limitations of records. Does anyone know
when frames will make an appearance?

Sergej

On Fri, Aug 31, 2012 at 8:20 AM, Richard O'Keefe <ok@REDACTED> wrote:

> We've just had a thread about what people like about Erlang.
> We also had the announcement of TinyMQ.
> So I'm going to use this as an example of what's *really*
> wrong with Erlang.
>
> Don't get me wrong.  I endorse everything everyone else has
> said in favour of Erlang.  Erlang is like democracy: the worst
> thing in its class except for all the others, and something
> that is increasingly imitated by people who just don't get
> some of the fundamental things about it.
>
> I also endorse what people have said in praise of TinyMQ.
> There are lots of things that it does right:
>  - there is a README
>  - there are EDoc comments with @specs for the public
>    interface
>  - the functions and variables are named well enough that
>    I was never in doubt about what any part of the code was
>    up to, at least not for longer than a second or two
>  - the hard work of process management is delegated to OTP
>    behaviours
> At this point, it's looking better than anything I've written.
>
> Make no mistake: I am not saying that Erlang or TinyMQ are *bad*.
> They are good things; I'm just ranting somewhat vaguely about
> why they should be better.
>
>
> LUMPS OF INDISTINGUISHABLE CODE.
>
>   Up to a certain level of hand-waving, TinyMQ can be roughly
>   understood thus:
>         The TinyMQ *system* is a monitor
>         guarding a dictionary mapping strings to channels,
>   where
>         a channel is a monitor
>         guarding a bag of subscribers and
>         a sliding window of {Message, Timestamp} pairs.
>
>   YOU CANNOT SEE THIS AT A GLANCE.
>
>   This is not Evan Miller's fault.  *Anything* you write in
>   Erlang is going to end up as lumps of indistinguishable code,
>   because there is nothing else for it to be.
>
>   This is also true in C, C++, Java, C#, Javascript, Go,
>   Eiffel, Smalltalk, Prolog, Haskell, Clean, SML, ...,
>   not to mention Visual Basic and Fortran.
>
>   Almost the only languages I know where it doesn't *have* to
>   be true are Lisp, Scheme, and Lisp-Flavoured Erlang.  Arguably
>   Prolog *could* be in this group, but in practice it usually is
>   in the other camp.  Thanks to the preprocessor, C *can* be
>   made rather more scrutable, but for some reason this is frowned on.
>
>   There's the e2 project (http://e2project.org) which is a step
>   in a good direction, but it doesn't do much about this problem.
>   A version of TinyMQ using e2_service instead of gen_server
>   would in fact exacerbate the problem by mushing
>   handle_call/3, handle_cast/2, and handle_info/2 into one
>   function, turning three lumps into one bigger lump.
>
> LUMPS OF DATA.
>
>   Take tinymq_channel_controller as an example.
>   Using an OTP behaviour means that all six dimensions of the state
>   are mushed together in one data structure.  This goes a long way
>   towards hiding the fact that
>
>         supervisor, channel, and max_age are never changed
>         messages, subscribers, and last_pull *are* changed.
>
>   One teeny tiny step here would be to offer an alternative set of
>   callbacks for some behaviours where the "state" is separated into
>   immutable "context" and mutable "state", so that it is obvious
>   *by construction* that the context information *can't* be changed.
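
A minimal sketch of what the segregated callbacks might look like. Everything here is hypothetical: the module name ctx_counter, the records, and the three-argument handle_call(Request, Context, State) shape are assumptions for illustration, not an existing OTP behaviour. The point is only that Context is produced once by init/1 and never appears in a return value, so a callback cannot change it.

```erlang
-module(ctx_counter).
-export([init/1, handle_call/3]).

%% Hypothetical split: fields that never change live in #context{},
%% fields that do change live in #state{}.
-record(context, {channel :: binary(), max_age :: pos_integer()}).
-record(state, {messages = [] :: [{term(), integer()}]}).

%% init/1 is the only place where the context is built.
init({Channel, MaxAge}) ->
    {ok, #context{channel = Channel, max_age = MaxAge}, #state{}}.

%% handle_call(Request, Context, State) -> {reply, Reply, NewState}
%% The return value has no slot for a new context, so it stays fixed.
handle_call({push, Msg, Now}, #context{max_age = MaxAge}, State) ->
    %% Keep only the pairs that are still within MaxAge of Now.
    Fresh = [P || {_, T} = P <- State#state.messages, Now - T =< MaxAge],
    {reply, ok, State#state{messages = [{Msg, Now} | Fresh]}};
handle_call(count, _Context, State) ->
    {reply, length(State#state.messages), State}.
```

A real behaviour would also need the usual From argument and cast/info variants; they are left out to keep the sketch short.
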
>
>   Another option would be to have some way of annotating in a
>   -record declaration that a field cannot be updated.
>
>   I prefer the segregation approach on the grounds of no language
>   change being needed and the improved efficiency of not copying
>   fields that can't have changed.  Others might prefer the revised
>   -record approach on the grounds of not having to change or
>   duplicate the OTP behaviours.
>
>   I had to read each file in detail
>   - to find that certain fields *happened* not to be changed
>   - to understand the design well enough to tell that this was
>     almost certainly deliberate.
>
> WE DOCUMENT THE WRONG THINGS.
>
>   It's well known that there are two kinds of documentation,
>   "external" documentation for people writing clients of a module,
>   and "internal" documentation for people maintaining the module
>   itself.  It's also well known that the division is simplistic;
>   if the external documentation is silent about material points
>   you have to read the internal documentation.
>
>   In languages like Prolog and Erlang and Scheme where you build
>   data structures out of existing "universal" types and have no
>   data structure declarations, we tend to document procedures
>   but not data.  This is backwards.  If you understand the data,
>   and especially its invariants, the code is often pretty obvious.
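
As an illustration of documenting the data first, here is a sketch of a message window with its invariants written down and made checkable. The module name channel_data, the type names, and the two invariants are assumptions made for this example; they are not taken from TinyMQ.

```erlang
-module(channel_data).
-export([is_valid_window/2]).
-export_type([window/0, entry/0]).

-type timestamp() :: integer().           % unit is an assumption
-type entry()     :: {Message :: term(), timestamp()}.

%% Invariants (assumed for illustration):
%%   1. entries are ordered newest first;
%%   2. no entry is more than MaxAge older than the newest entry.
-type window()    :: [entry()].

%% Returns true when a window satisfies both invariants above.
-spec is_valid_window(window(), non_neg_integer()) -> boolean().
is_valid_window([], _MaxAge) ->
    true;
is_valid_window([{_, Newest} | _] = Window, MaxAge) ->
    Stamps = [T || {_, T} <- Window],
    %% Newest first means the timestamps are already in descending order.
    Stamps =:= lists:reverse(lists:sort(Stamps))
        andalso lists:all(fun(T) -> Newest - T =< MaxAge end, Stamps).
```

With the invariants stated next to the type, a reader can tell what any function touching a window() is obliged to preserve before reading a single clause of it.
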
>
>   There are two examples of this in TinyMQ.  One is specific to
>   TinyMQ.  The other is nearly universal in Erlang practice.
>
>   Erlang systems are made of lots of processes sending messages
>   to each other.  Joe Armstrong has often said THINK ABOUT THE
>   PROTOCOLS.  But Erlang programmers very seldom *write* about
>   the protocols.
>
>   Using the OTP behaviours, a "concurrent object" is implemented
>   as a module with a bunch of interface functions that forward
>   messages through the OTP layer to the callback code managed by
>   whatever behaviour it is.  This protocol is unique to each kind
>   of concurrent object.  It's often generated in one module (the
>   one with the interface functions) and consumed in another (the
>   one with the callback code), as it is in TinyMQ.  And it's not
>   documented.
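
One way to give such a protocol a home is a single module that declares every request as a type, with a comment per alternative, plus a guard that both sides can call. This is only a sketch: the module name channel_protocol and the three alternatives are illustrative assumptions, not TinyMQ's actual messages.

```erlang
-module(channel_protocol).
-export([is_request/1]).
-export_type([request/0]).

%% Every message the interface module may send to the callback module.
%% (Hypothetical alternatives, chosen for illustration only.)
-type request() ::
        {subscribe, Subscriber :: pid(), Since :: integer()}
        %% deliver messages newer than Since to Subscriber
      | {push, Message :: term()}
        %% append Message to the channel
      | {poll, Since :: integer()}.
        %% reply with messages newer than Since

%% Runtime check usable on either side of the protocol.
-spec is_request(term()) -> boolean().
is_request({subscribe, Pid, Since}) when is_pid(Pid), is_integer(Since) ->
    true;
is_request({push, _Message}) ->
    true;
is_request({poll, Since}) when is_integer(Since) ->
    true;
is_request(_) ->
    false.
```

If the interface functions carry -spec declarations in terms of request(), a tool such as Dialyzer has a chance of noticing when one side drifts; a message that is handled but never declared simply fails is_request/1.
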
>
>   It is possible to reconstruct this protocol by reading the code
>   in detail and noting down what you see.  It is troublesome when,
>   as in TinyMQ, the two modules disagree about the protocol.  It's
>   clear that _something_ is wrong, but what, exactly?
>
>   For example, tinymq_controller has a case
>         handle_cast({set_max_age, NewMaxAge}, State) ->
>   but this is the only occurrence of set_max_age anywhere in TinyMQ.
>   Is its presence in tinymq_controller an example of dead code,
>   or is its absence from the rest of the application an example
>   of missing code?  The same question can be asked about 'expire'
>   (which would forget a channel without making it actually go away,
>    if it could ever be invoked, which it can't.)
>
>   Almost as soon as I started reading Erlang code many years ago
>   it seemed obvious to me that documenting (and if possible, type
>   checking) these internal protocols was a very important part of
>   Erlang internal documentation.  There must be something wrong
>   with my brain, because other people don't seem to feel this lack
>   anywhere near as strongly as I do.  I think Joe Armstrong sort
>   of sees this at the next level up or he would never have invented
>   UBF.
>
>   But Occam, Go, and Sing# have typed channels, so they *are*
>   addressing the issue, and *do* have a natural central point to
>   document what the alternatives of an internal protocol signify.
>
>   Another documentation failure is that we fail to document what
>   is not there.  In TinyMQ, a channel automatically comes into
>   existence when you try to use it.  Perhaps as a consequence of
>   this, there is no way to shut a channel down.  In TinyMQ, old
>   messages are not removed from a channel when they expire, but
>   the next time someone does a 'subscribe' (waves hands) or a 'poll'
>   or a 'push' *after* they expire.  So if processes stop sending
>   messages to, and requesting messages from, some channel, the last
>   few messages, no matter how large, may hang around forever.  I'm sure there
>   is a reason, but because it's a reason for something *not* being
>   there, there's no obvious place to hang the comment, and there
>   isn't one.  (Except for the dead 'expire' clause mentioned above.)
>
> IT'S HARD TO SPOT SALIENT DETAIL IN A SEA OF GLUE CODE.
>
>   The central fact about TinyMQ is that it holds the messages of
>   a channel in a simple list of {Message, Timestamp} pairs.  As
>   a result, every operation on the data takes time linear in the
>   current size.
>
>   This is not stated anywhere in any comments nor in the README.
>   You have to read the code in detail to discover this.  And it
>   is a rather nasty surprise.  If a channel holds N messages,
>   the operations *can* be done in O(log(N)) time.  (I believe it
>   is possible to do even better.)  Some sliding window applications
>   have a bound on the number of elements in the window.  This one
>   has a bound on the age of elements, but they could arrive at a
>   very high rate, so N *could* get large.
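
For comparison, a rough sketch of an age-bounded window with logarithmic operations, built on the standard gb_trees module. The module name sliding_window and its API are assumptions for illustration; this is not how TinyMQ stores messages. Keys are {Timestamp, Seq} pairs so that two messages with the same timestamp do not collide.

```erlang
-module(sliding_window).
-export([new/1, push/3, since/2, count/1]).

-record(window, {max_age,                  % maximum age of an entry
                 seq = 0,                  % tie-breaker for equal timestamps
                 tree = gb_trees:empty()}).

new(MaxAge) ->
    #window{max_age = MaxAge}.

%% Insert Message at time Now, then drop entries older than MaxAge.
%% Costs O(log N) for the insert plus O(log N) per expired entry.
push(Message, Now, #window{seq = Seq, tree = Tree0} = W) ->
    Tree1 = gb_trees:insert({Now, Seq}, Message, Tree0),
    W#window{seq = Seq + 1, tree = expire(Now - W#window.max_age, Tree1)}.

%% Remove entries from the old end while they are older than Cutoff.
expire(Cutoff, Tree) ->
    case gb_trees:is_empty(Tree) of
        true ->
            Tree;
        false ->
            case gb_trees:smallest(Tree) of
                {{T, _}, _} when T < Cutoff ->
                    {_, _, Rest} = gb_trees:take_smallest(Tree),
                    expire(Cutoff, Rest);
                _ ->
                    Tree
            end
    end.

%% All {Message, Timestamp} pairs with Timestamp >= Since, oldest first.
since(Since, #window{tree = Tree}) ->
    collect(gb_trees:next(gb_trees:iterator_from({Since, 0}, Tree))).

collect(none) ->
    [];
collect({{T, _}, Message, Iter}) ->
    [{Message, T} | collect(gb_trees:next(Iter))].

count(#window{tree = Tree}) ->
    gb_trees:size(Tree).
```

Expiry here still happens only on push, as in the behaviour described above; a timer-driven expire would be a separate design decision.
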
>
>   It is very easy to implement the necessary operations using lists,
>   so much so that they are present in several copies.  Revising the
>   TinyMQ implementation to work better with long queues would be
>   harder than necessary because of this.  And this goes un-noticed
>   because there is so much glue code for the guts to get lost in.
>
>   Given that Evan Miller took the trouble to use library components
>   for structuring this application, why didn't he take the next step,
>   and use the existing 'sliding window' library data structure?
>
>         Because there is none!
>
>   Yet sliding windows of one sort or another have come up before in
>   this mailing list.  Perhaps we should have a Wiki page on
>   trapexit to gather requirements for one or more sliding window
>   libraries.  Or perhaps not.  "true religion jeans for women" --
>   what has that or "Cheap Nike Shoes" to do with Erlang/OTP
>   (http://www.trapexit.org/forum/viewforum.php?f=20)?
>