My biggest gripe with Erlang is the limitations of records. Does anyone know when frames will make an appearance?<div><br></div><div><br></div><div>Sergej<br><br><div class="gmail_quote">On Fri, Aug 31, 2012 at 8:20 AM, Richard O'Keefe <span dir="ltr">&lt;<a href="mailto:ok@cs.otago.ac.nz" target="_blank">ok@cs.otago.ac.nz</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">We've just had a thread about what people like about Erlang.<br>
We also had the announcement of TinyMQ.<br>
So I'm going to use this as an example of what's *really*<br>
wrong with Erlang.<br>
<br>
Don't get me wrong. I endorse everything everyone else has<br>
said in favour of Erlang. Erlang is like democracy: the worst<br>
thing in its class except for all the others, and something<br>
that is increasingly imitated by people who just don't get<br>
some of the fundamental things about it.<br>
<br>
I also endorse what people have said in praise of TinyMQ.<br>
There are lots of things that it does right:<br>
- there is a README<br>
- there are EDoc comments with @specs for the public<br>
interface<br>
- the functions and variables are named well enough that<br>
I was never in doubt about what any part of the code was<br>
up to, at least not for longer than a second or two<br>
- the hard work of process management is delegated to OTP<br>
behaviours<br>
At this point, it's looking better than anything I've written.<br>
<br>
Make no mistake: I am not saying that Erlang or TinyMQ are *bad*.<br>
They are good things; I'm just ranting somewhat vaguely about<br>
why they should be better.<br>
<br>
<br>
LUMPS OF INDISTINGUISHABLE CODE.<br>
<br>
Up to a certain level of hand-waving, TinyMQ can be roughly<br>
understood thus:<br>
The TinyMQ *system* is a monitor<br>
guarding a dictionary mapping strings to channels,<br>
where<br>
a channel is a monitor<br>
guarding a bag of subscribers and<br>
a sliding window of {Message, Timestamp} pairs.<br>
<br>
YOU CANNOT SEE THIS AT A GLANCE.<br>
<br>
This is not Evan Miller's fault. *Anything* you write in<br>
Erlang is going to end up as lumps of indistinguishable code,<br>
because there is nothing else for it to be.<br>
<br>
This is also true in C, C++, Java, C#, JavaScript, Go,<br>
Eiffel, Smalltalk, Prolog, Haskell, Clean, SML, ...,<br>
not to mention Visual Basic and Fortran.<br>
<br>
Almost the only languages I know where it doesn't *have* to<br>
be true are Lisp, Scheme, and Lisp-Flavoured Erlang. Arguably<br>
Prolog *could* be in this group, but in practice it usually is<br>
in the other camp. Thanks to the preprocessor, C *can* be<br>
made rather more scrutable, but for some reason this is frowned on.<br>
<br>
There's the e2 project (<a href="http://e2project.org" target="_blank">http://e2project.org</a>) which is a step<br>
in a good direction, but it doesn't do much about this problem.<br>
A version of TinyMQ using e2_service instead of gen_server<br>
would in fact exacerbate the problem by mushing<br>
handle_call/3, handle_cast/2, and handle_info/2 into one<br>
function, turning three lumps into one bigger lump.<br>
<br>
LUMPS OF DATA.<br>
<br>
Take tinymq_channel_controller as an example.<br>
Using an OTP behaviour means that all six dimensions of the state<br>
are mushed together in one data structure. This goes a long way<br>
towards hiding the fact that<br>
<br>
supervisor, channel, and max_age are never changed<br>
messages, subscribers, and last_pull *are* changed.<br>
<br>
One teeny tiny step here would be to offer an alternative set of<br>
callbacks for some behaviours where the "state" is separated into<br>
immutable "context" and mutable "state", so that it is obvious<br>
*by construction* that the context information *can't* be changed.<br>
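<br>
A minimal sketch of the segregation idea, assuming a hand-rolled server loop rather than any real OTP behaviour. The module name ctx_server, its functions, and the handler shape are all hypothetical; nothing here is taken from OTP or TinyMQ.<br>
<pre>
```erlang
%% Hypothetical sketch: not part of OTP or TinyMQ.
-module(ctx_server).
-export([start/3, call/2, demo/0]).

%% Start a server. Context is fixed for the life of the process;
%% only State can be replaced by the handler.
start(Handler, Context, State) ->
    spawn(fun() -> loop(Handler, Context, State) end).

%% Synchronous request, in the spirit of gen_server:call/2.
call(Pid, Request) ->
    Ref = make_ref(),
    Pid ! {call, self(), Ref, Request},
    receive
        {Ref, Reply} -> Reply
    after 5000 ->
        exit(timeout)
    end.

%% The handler returns {Reply, NewState}. It has no way to return
%% a new Context, so the context cannot change by construction.
loop(Handler, Context, State) ->
    receive
        {call, From, Ref, Request} ->
            {Reply, NewState} = Handler(Request, Context, State),
            From ! {Ref, Reply},
            loop(Handler, Context, NewState)
    end.

%% Demo: max_age lives in the context, the messages in the state.
demo() ->
    Handler = fun({push, Msg}, _Context, Msgs) ->
                      {ok, [Msg | Msgs]};
                 (max_age, #{max_age := MaxAge}, Msgs) ->
                      {MaxAge, Msgs};
                 (count, _Context, Msgs) ->
                      {length(Msgs), Msgs}
              end,
    Pid = start(Handler, #{max_age => 60}, []),
    ok = call(Pid, {push, hello}),
    ok = call(Pid, {push, world}),
    {call(Pid, max_age), call(Pid, count)}.
```
</pre>
Here ctx_server:demo() returns {60, 2}. A reader can see from the signature of the handler alone which part of the data is allowed to change.<br>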
<br>
Another option would be to have some way of annotating in a<br>
-record declaration that a field cannot be updated.<br>
<br>
I prefer the segregation approach on the grounds of no language<br>
change being needed and the improved efficiency of not copying<br>
fields that can't have changed. Others might prefer the revised<br>
-record approach on the grounds of not having to change or<br>
duplicate the OTP behaviours.<br>
<br>
I had to read each file in detail<br>
- to find that certain fields *happened* not to be changed<br>
- to understand the design well enough to tell that this was<br>
almost certainly deliberate.<br>
<br>
WE DOCUMENT THE WRONG THINGS.<br>
<br>
It's well known that there are two kinds of documentation,<br>
"external" documentation for people writing clients of a module,<br>
and "internal" documentation for people maintaining the module<br>
itself. It's also well known that the division is simplistic;<br>
if the external documentation is silent about material points<br>
you have to read the internal documentation.<br>
<br>
In languages like Prolog and Erlang and Scheme where you build<br>
data structures out of existing "universal" types and have no<br>
data structure declarations, we tend to document procedures<br>
but not data. This is backwards. If you understand the data,<br>
and especially its invariants, the code is often pretty obvious.<br>
<br>
There are two examples of this in TinyMQ. One is specific to<br>
TinyMQ. The other is nearly universal in Erlang practice.<br>
<br>
Erlang systems are made of lots of processes sending messages<br>
to each other. Joe Armstrong has often said THINK ABOUT THE<br>
PROTOCOLS. But Erlang programmers very seldom *write* about<br>
the protocols.<br>
<br>
Using the OTP behaviours, a "concurrent object" is implemented<br>
as a module with a bunch of interface functions that forward<br>
messages through the OTP layer to the callback code managed by<br>
whatever behaviour it is. This protocol is unique to each kind<br>
of concurrent object. It's often generated in one module (the<br>
one with the interface functions) and consumed in another (the<br>
one with the callback code), as it is in TinyMQ. And it's not<br>
documented.<br>
<br>
It is possible to reconstruct this protocol by reading the code<br>
in detail and noting down what you see. It is troublesome when,<br>
as in TinyMQ, the two modules disagree about the protocol. It's<br>
clear that _something_ is wrong, but what, exactly?<br>
<br>
For example, tinymq_controller has a case<br>
handle_cast({set_max_age, NewMaxAge}, State) -><br>
but this is the only occurrence of set_max_age anywhere in TinyMQ.<br>
Is its presence in tinymq_controller an example of dead code,<br>
or is its absence from the rest of the application an example<br>
of missing code? The same question can be asked about 'expire'<br>
(which would forget a channel without making it actually go away,<br>
if it could ever be invoked, which it can't.)<br>
<br>
Almost as soon as I started reading Erlang code many years ago<br>
it seemed obvious to me that documenting (and if possible, type<br>
checking) these internal protocols was a very important part of<br>
Erlang internal documentation. There must be something wrong<br>
with my brain, because other people don't seem to feel this lack<br>
anywhere near as strongly as I do. I think Joe Armstrong sort<br>
of sees this at the next level up or he would never have invented<br>
UBF.<br>
<br>
But Occam, Go, and Sing# have typed channels, so they *are*<br>
addressing the issue, and *do* have a natural central point to<br>
document what the alternatives of an internal protocol signify.<br>
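<br>
In Erlang, one way to get a similar central point is a -type declaration that both the interface module and the callback module refer to. The sketch below is only an illustration: the module name channel_protocol and the message shapes are hypothetical and are not copied from the TinyMQ source.<br>
<pre>
```erlang
%% Hypothetical sketch; the message shapes are illustrative and
%% are not copied from the TinyMQ source.
-module(channel_protocol).
-export([push/1, poll/1, describe/1]).
-export_type([request/0]).

%% The whole internal protocol, documented in one place.
-type timestamp() :: integer().
-type request() ::
        {push, Message :: term()}       % append Message to the window
      | {poll, Since :: timestamp()}.   % fetch messages newer than Since

%% Constructors for the interface module to use, so that requests
%% are built in exactly one place.
-spec push(term()) -> request().
push(Message) -> {push, Message}.

-spec poll(timestamp()) -> request().
poll(Since) when is_integer(Since) -> {poll, Since}.

%% What a callback module would match on. There is no catch-all
%% clause, so a message outside request() crashes with
%% function_clause instead of being silently ignored.
-spec describe(request()) -> atom().
describe({push, _Message}) -> push;
describe({poll, _Since}) -> poll.
```
</pre>
With something like this, a stray clause such as set_max_age would either appear in request() with a comment saying what it means, or stand out as not belonging to the protocol at all.<br>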
<br>
Another documentation failure is that we fail to document what<br>
is not there. In TinyMQ, a channel automatically comes into<br>
existence when you try to use it. Perhaps as a consequence of<br>
this, there is no way to shut a channel down. In TinyMQ, old<br>
messages are not removed from a channel when they expire, but<br>
the next time someone does a 'subscribe' (waves hands) or a 'poll'<br>
or a 'push' *after* they expire. So if processes stop sending<br>
messages to and requesting messages from some channel, the last few messages,<br>
no matter how large, may hang around forever. I'm sure there<br>
is a reason, but because it's a reason for something *not* being<br>
there, there's no obvious place to hang the comment, and there<br>
isn't one. (Except for the dead 'expire' clause mentioned above.)<br>
<br>
IT'S HARD TO SPOT SALIENT DETAIL IN A SEA OF GLUE CODE.<br>
<br>
The central fact about TinyMQ is that it holds the messages of<br>
a channel in a simple list of {Message, Timestamp} pairs. As<br>
a result, every operation on the data takes time linear in the<br>
current size.<br>
<br>
This is not stated anywhere in any comments nor in the README.<br>
You have to read the code in detail to discover this. And it<br>
is a rather nasty surprise. If a channel holds N messages,<br>
the operations *can* be done in O(log(N)) time. (I believe it<br>
is possible to do even better.) Some sliding window applications<br>
have a bound on the number of elements in the window. This one<br>
has a bound on the age of elements, but they could arrive at a<br>
very high rate, so N *could* get large.<br>
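<br>
As a rough illustration of an age-bounded window that avoids rescanning the whole list, here is a minimal sketch built on the stdlib queue module. It is not the O(log(N)) structure alluded to above and it is not TinyMQ code: the module name sliding_window is hypothetical, it covers only insertion and expiry, and it assumes the caller supplies non-decreasing integer timestamps.<br>
<pre>
```erlang
%% Hypothetical sketch, not TinyMQ code. Timestamps are assumed to
%% be non-decreasing integers supplied by the caller.
-module(sliding_window).
-export([new/1, push/3, expire/2, to_list/1]).

%% A window is {MaxAge, Queue} with the oldest entry at the front.
new(MaxAge) when is_integer(MaxAge), MaxAge >= 0 ->
    {MaxAge, queue:new()}.

%% Add Message stamped Now at the rear, then drop expired entries.
push(Message, Now, {MaxAge, Q}) ->
    expire(Now, {MaxAge, queue:in({Message, Now}, Q)}).

%% Drop entries from the front while they are older than MaxAge.
%% Each entry is removed at most once, so the cost is amortised
%% constant per message.
expire(Now, {MaxAge, Q} = Window) ->
    case queue:peek(Q) of
        {value, {_Message, Timestamp}} when Now - Timestamp > MaxAge ->
            expire(Now, {MaxAge, queue:drop(Q)});
        _ ->
            Window
    end.

%% Surviving {Message, Timestamp} pairs, oldest first.
to_list({_MaxAge, Q}) ->
    queue:to_list(Q).
```
</pre>
For example, with MaxAge = 10, pushing a at time 0, b at time 5 and c at time 12 leaves [{b,5},{c,12}], because a is 12 time units old when c arrives. Answering "everything since timestamp T" quickly would need a different structure, such as a balanced tree keyed on the timestamp.<br>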
<br>
It is very easy to implement the necessary operations using lists,<br>
so much so that they are present in several copies. Revising the<br>
TinyMQ implementation to work better with long queues would be<br>
harder than necessary because of this. And this goes unnoticed<br>
because there is so much glue code for the guts to get lost in.<br>
<br>
Given that Evan Miller took the trouble to use library components<br>
for structuring this application, why didn't he take the next step,<br>
and use the existing 'sliding window' library data structure?<br>
<br>
Because there is none!<br>
<br>
Yet sliding windows of one sort or another have come up before in<br>
this mailing list. Perhaps we should have a Wiki page on<br>
trapexit to gather requirements for one or more sliding window<br>
libraries. Or perhaps not. "true religion jeans for women" --<br>
what has that or "Cheap Nike Shoes" to do with Erlang/OTP<br>
(<a href="http://www.trapexit.org/forum/viewforum.php?f=20" target="_blank">http://www.trapexit.org/forum/viewforum.php?f=20</a>)?<br>
<br>
</blockquote></div><br></div>