[erlang-questions] What I dislike about Erlang
Rapsey
rapsey@REDACTED
Fri Aug 31 09:43:41 CEST 2012
My biggest gripe with Erlang is the limitations of records. Does anyone know
when frames will make an appearance?
Sergej
On Fri, Aug 31, 2012 at 8:20 AM, Richard O'Keefe <ok@REDACTED> wrote:
> We've just had a thread about what people like about Erlang.
> We also had the announcement of TinyMQ.
> So I'm going to use this as an example of what's *really*
> wrong with Erlang.
>
> Don't get me wrong. I endorse everything everyone else has
> said in favour of Erlang. Erlang is like democracy: the worst
> thing in its class except for all the others, and something
> that is increasingly imitated by people who just don't get
> some of the fundamental things about it.
>
> I also endorse what people have said in praise of TinyMQ.
> There are lots of things that it does right:
>  - there is a README
>  - there are EDoc comments with @specs for the public
>    interface
>  - the functions and variables are named well enough that
>    I was never in doubt about what any part of the code was
>    up to, at least not for longer than a second or two
>  - the hard work of process management is delegated to OTP
>    behaviours
> At this point, it's looking better than anything I've written.
>
> Make no mistake: I am not saying that Erlang or TinyMQ are *bad*.
> They are good things; I'm just ranting somewhat vaguely about
> why they should be better.
>
>
> LUMPS OF INDISTINGUISHABLE CODE.
>
> Up to a certain level of hand-waving, TinyMQ can be roughly
> understood thus:
>     The TinyMQ *system* is a monitor
>         guarding a dictionary mapping strings to channels,
>     where
>         a channel is a monitor
>         guarding a bag of subscribers and
>                  a sliding window of {Message, Timestamp} pairs.
>
> YOU CANNOT SEE THIS AT A GLANCE.
>
> This is not Evan Miller's fault. *Anything* you write in
> Erlang is going to end up as lumps of indistinguishable code,
> because there is nothing else for it to be.
>
> This is also true in C, C++, Java, C#, Javascript, Go,
> Eiffel, Smalltalk, Prolog, Haskell, Clean, SML, ...,
> not to mention Visual Basic and Fortran.
>
> Almost the only languages I know where it doesn't *have* to
> be true are Lisp, Scheme, and Lisp-Flavoured Erlang. Arguably
> Prolog *could* be in this group, but in practice it usually is
> in the other camp. Thanks to the preprocessor, C *can* be
> made rather more scrutable, but for some reason this is frowned on.
>
> There's the e2 project (http://e2project.org) which is a step
> in a good direction, but it doesn't do much about this problem.
> A version of TinyMQ using e2_service instead of gen_server
> would in fact exacerbate the problem by mushing
> handle_call/3, handle_cast/2, and handle_info/2 into one
> function, turning three lumps into one bigger lump.
>
> LUMPS OF DATA.
>
> Take tinymq_channel_controller as an example.
> Using an OTP behaviour means that all six dimensions of the state
> are mushed together in one data structure. This goes a long way
> towards hiding the fact that
>
> supervisor, channel, and max_age are never changed
> messages, subscribers, and last_pull *are* changed.
>
> One teeny tiny step here would be to offer an alternative set of
> callbacks for some behaviours where the "state" is separated into
> immutable "context" and mutable "state", so that it is obvious
> *by construction* that the context information *can't* be changed.
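To make the context/state split concrete, here is a minimal sketch of what such a callback shape could look like. The module name ctx_server and its start/3 and call/2 functions are hypothetical and not part of OTP; the point is only that the handler receives Context but can return nothing except a new State.

```erlang
-module(ctx_server).
-export([start/3, call/2, demo/0]).

%% Hypothetical mini-behaviour for illustration only; not part of OTP.
%% Handler :: fun((Request, Context, State) -> {reply, Reply, NewState}).
%% Context is handed to every callback but never returned by it, so a
%% callback has no way to change it.

start(Handler, Context, State) ->
    spawn(fun() -> loop(Handler, Context, State) end).

call(Pid, Request) ->
    Ref = erlang:monitor(process, Pid),
    Pid ! {call, self(), Ref, Request},
    receive
        {Ref, Reply} ->
            erlang:demonitor(Ref, [flush]),
            Reply;
        {'DOWN', Ref, process, Pid, Reason} ->
            exit(Reason)
    end.

%% Context is threaded through unchanged; only State is replaced.
loop(Handler, Context, State) ->
    receive
        {call, From, Ref, Request} ->
            {reply, Reply, NewState} = Handler(Request, Context, State),
            From ! {Ref, Reply},
            loop(Handler, Context, NewState)
    end.

%% Example: max_age lives in the context, the message list in the state.
demo() ->
    Context = [{channel, "news"}, {max_age, 60}],
    Handler =
        fun({push, Msg}, _Ctx, Msgs) ->
                {reply, ok, [Msg | Msgs]};
           (max_age, Ctx, Msgs) ->
                {reply, proplists:get_value(max_age, Ctx), Msgs};
           (messages, _Ctx, Msgs) ->
                {reply, lists:reverse(Msgs), Msgs}
        end,
    Pid = start(Handler, Context, []),
    ok = call(Pid, {push, hello}),
    60 = call(Pid, max_age),
    [hello] = call(Pid, messages),
    ok.
```

Calling ctx_server:demo() exercises the three requests. A real behaviour would of course also need casts, info messages, termination and code change, which are left out here.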
>
> Another option would be to have some way of annotating in a
> -record declaration that a field cannot be updated.
>
> I prefer the segregation approach on the grounds of no language
> change being needed and the improved efficiency of not copying
> fields that can't have changed. Others might prefer the revised
> -record approach on the grounds of not having to change or
> duplicate the OTP behaviours.
>
> I had to read each file in detail
>  - to find that certain fields *happened* not to be changed
>  - to understand the design well enough to tell that this was
>    almost certainly deliberate.
>
> WE DOCUMENT THE WRONG THINGS.
>
> It's well known that there are two kinds of documentation,
> "external" documentation for people writing clients of a module,
> and "internal" documentation for people maintaining the module
> itself. It's also well known that the division is simplistic;
> if the external documentation is silent about material points
> you have to read the internal documentation.
>
> In languages like Prolog and Erlang and Scheme where you build
> data structures out of existing "universal" types and have no
> data structure declarations, we tend to document procedures
> but not data. This is backwards. If you understand the data,
> and especially its invariants, the code is often pretty obvious.
>
> There are two examples of this in TinyMQ. One is specific to
> TinyMQ. The other is nearly universal in Erlang practice.
>
> Erlang systems are made of lots of processes sending messages
> to each other. Joe Armstrong has often said THINK ABOUT THE
> PROTOCOLS. But Erlang programmers very seldom *write* about
> the protocols.
>
> Using the OTP behaviours, a "concurrent object" is implemented
> as a module with a bunch of interface functions that forward
> messages through the OTP layer to the callback code managed by
> whatever behaviour it is. This protocol is unique to each kind
> of concurrent object. It's often generated in one module (the
> one with the interface functions) and consumed in another (the
> one with the callback code), as it is in TinyMQ. And it's not
> documented.
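One low-cost way to write such a protocol down is to give it a type and a single place where it is checked. The sketch below is an assumption for illustration: the module channel_protocol, the type cast() and the two message shapes are hypothetical and are not TinyMQ's actual protocol.

```erlang
-module(channel_protocol).
-export([check_cast/1]).
-export_type([cast/0]).

%% Hypothetical protocol, for illustration only.
%% Every asynchronous message a channel process accepts is listed here,
%% so the interface module and the callback module share one definition.
-type cast() :: {set_max_age, Seconds :: non_neg_integer()}
              | {push, Message :: term()}.

%% Runtime check that a term is a well-formed cast() message.
-spec check_cast(term()) -> {ok, cast()} | {error, {unknown_cast, term()}}.
check_cast({set_max_age, Seconds} = Cast)
  when is_integer(Seconds), Seconds >= 0 ->
    {ok, Cast};
check_cast({push, _Message} = Cast) ->
    {ok, Cast};
check_cast(Other) ->
    {error, {unknown_cast, Other}}.
```

With the type exported, both sides of the protocol can refer to channel_protocol:cast() in their -spec declarations, and the comments on the type are the natural place to say what each alternative means.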
>
> It is possible to reconstruct this protocol by reading the code
> in detail and noting down what you see. It is troublesome when,
> as in TinyMQ, the two modules disagree about the protocol. It's
> clear that _something_ is wrong, but what, exactly?
>
> For example, tinymq_controller has a case
>     handle_cast({set_max_age, NewMaxAge}, State) ->
> but this is the only occurrence of set_max_age anywhere in TinyMQ.
> Is its presence in tinymq_controller an example of dead code,
> or is its absence from the rest of the application an example
> of missing code? The same question can be asked about 'expire'
> (which would forget a channel without making it actually go away,
> if it could ever be invoked, which it can't.)
>
> Almost as soon as I started reading Erlang code many years ago
> it seemed obvious to me that documenting (and if possible, type
> checking) these internal protocols was a very important part of
> Erlang internal documentation. There must be something wrong
> with my brain, because other people don't seem to feel this lack
> anywhere nearly as strongly as I do. I think Joe Armstrong sort
> of sees this at the next level up or he would never have invented
> UBF.
>
> But Occam, Go, and Sing# have typed channels, so they *are*
> addressing the issue, and *do* have a natural central point to
> document what the alternatives of an internal protocol signify.
>
> Another documentation failure is that we fail to document what
> is not there. In TinyMQ, a channel automatically comes into
> existence when you try to use it. Perhaps as a consequence of
> this, there is no way to shut a channel down. In TinyMQ, old
> messages are not removed from a channel when they expire, but
> the next time someone does a 'subscribe' (waves hands) or a 'poll'
> or a 'push' *after* they expire. So if processes stop sending
> messages to, and requesting messages from, some channel, the last
> few messages, no matter how large, may hang around forever. I'm
> sure there is a reason, but because it's a reason for something
> *not* being
> there, there's no obvious place to hang the comment, and there
> isn't one. (Except for the dead 'expire' clause mentioned above.)
>
> IT'S HARD TO SPOT SALIENT DETAIL IN A SEA OF GLUE CODE.
>
> The central fact about TinyMQ is that it holds the messages of
> a channel in a simple list of {Message, Timestamp} pairs. As
> a result, every operation on the data takes time linear in the
> current size.
>
> This is not stated anywhere in any comments nor in the README.
> You have to read the code in detail to discover this. And it
> is a rather nasty surprise. If a channel holds N messages,
> the operations *can* be done in O(log(N)) time. (I believe it
> is possible to do even better.) Some sliding window applications
> have a bound on the number of elements in the window. This one
> has a bound on the age of elements, but they could arrive at a
> very high rate, so N *could* get large.
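As a rough sketch of the logarithmic alternative, a window can be kept in a gb_trees tree keyed by {Timestamp, Seq}. The module sliding_window and its function names are hypothetical, and gb_trees:iterator_from/2 is only available in OTP 18 and later. Under those assumptions an insert costs O(log N), expiry costs O(log N) per message removed, and reading the messages since some time costs O(log N) plus the number returned.

```erlang
-module(sliding_window).
-export([new/0, push/3, expire/2, since/2, demo/0]).

%% Hypothetical sketch, for illustration only.
%% A window is {NextSeq, Tree}. Keys are {Timestamp, Seq}, so two
%% messages with the same timestamp still get distinct keys.

new() ->
    {0, gb_trees:empty()}.

%% Add Msg with the given numeric Timestamp.
push(Msg, Timestamp, {Seq, Tree}) ->
    {Seq + 1, gb_trees:insert({Timestamp, Seq}, Msg, Tree)}.

%% Drop every message whose timestamp is older than Cutoff.
expire(Cutoff, {Seq, Tree}) ->
    {Seq, drop_old(Cutoff, Tree)}.

drop_old(Cutoff, Tree) ->
    case gb_trees:is_empty(Tree) of
        true ->
            Tree;
        false ->
            case gb_trees:smallest(Tree) of
                {{T, _}, _} when T < Cutoff ->
                    {_, _, Rest} = gb_trees:take_smallest(Tree),
                    drop_old(Cutoff, Rest);
                _ ->
                    Tree
            end
    end.

%% Return {Message, Timestamp} pairs with Timestamp >= From, oldest first.
%% Seq is never negative, so {From, -1} sorts before every key at From.
since(From, {_Seq, Tree}) ->
    collect(gb_trees:next(gb_trees:iterator_from({From, -1}, Tree))).

collect(none) ->
    [];
collect({{T, _}, Msg, Iter}) ->
    [{Msg, T} | collect(gb_trees:next(Iter))].

demo() ->
    W0 = new(),
    W1 = push(a, 10, W0),
    W2 = push(b, 20, W1),
    W3 = push(c, 20, W2),
    W4 = push(d, 30, W3),
    [{b, 20}, {c, 20}, {d, 30}] = since(20, W4),
    W5 = expire(25, W4),
    [{d, 30}] = since(0, W5),
    ok.
```

The same interface could be backed by a plain list, which is what makes it a candidate for a shared library: callers would not need to change if the representation did.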
>
> It is very easy to implement the necessary operations using lists,
> so much so that they are present in several copies. Revising the
> TinyMQ implementation to work better with long queues would be
> harder than necessary because of this. And this goes un-noticed
> because there is so much glue code for the guts to get lost in.
>
> Given that Evan Miller took the trouble to use library components
> for structuring this application, why didn't he take the next step,
> and use the existing 'sliding window' library data structure?
>
> Because there is none!
>
> Yet sliding windows of one sort or another have come up before in
> this mailing list. Perhaps we should have a Wiki page on
> trapexit to gather requirements for one or more sliding window
> libraries. Or perhaps not. "true religion jeans for women" --
> what has that or "Cheap Nike Shoes" to do with Erlang/OTP
> (http://www.trapexit.org/forum/viewforum.php?f=20)?
>