[erlang-questions] Best practice for defining functions with edoc,erlc,eunit and the dialyzer

Fri Dec 4 00:49:07 CET 2009

On Dec 3, 2009, at 10:55 PM, Zoltan Lajos Kis wrote:
> I prefer to group the "API" functions based on their functionality

If their functionality is distinct enough to make this easy,
why are they in the same module?

OK, let's take Smalltalk as a model, and consider a Set data type:

-export([
   % @category Instance creation
     empty/0,			% empty() is empty
     from_list/1,		% from_list([X1,...,Xn]) has the listed members
     singleton/1,		% singleton(X) has X as only member
   % @category Size
     is_empty/1,			% is_empty(S) iff S is empty()
     not_empty/1,		% not_empty(S) iff not is_empty(S)
     size/1,			% size(S) is number of elements
   % @category Single element operations
     arb/1,			% arb(S) when not_empty(S) is some element
     excludes/2,			% excludes(S, X) iff X is not in S
     includes/2,			% includes(S, X) iff X is in S
     sans/2,			% sans(S, X) is S \ {X}
     with/2,			% with(S, X) is S U {X}
   % @category Whole container comparisons
     are_equal/2,		% are_equal(S1, S2) iff S1\S2 and S2\S1 empty
     not_equal/2,		% not_equal(S1, S2) iff S1 S2 different as sets
     excludes_all/2,		% excludes_all(S1, S2) iff no overlap
     excludes_any/2,		% excludes_any(S1, S2) iff not_empty(S2\S1)
     includes_all/2,		% includes_all(S1, S2) iff is_empty(S2\S1)
     includes_any/2,		% includes_any(S1, S2) iff some overlap
   % @category Whole container operations
     difference/2,		% difference(S1, S2) = S1 \ S2
     intersection/2,		% intersection(S1, S2)
     union/2,			% union(S1, S2) = S1 U S2
   % @category Iteration
     all_satisfy/2,		% all_satisfy(S, P) when P(X) for all X in S
     any_satisfy/2,		% any_satisfy(S, P) when P(X) for some X in S
     fold/3,			% fold(F, A, S)
     map/2,			% map(F, S) = {F(X) | X <- S}
     none_satisfy/2,		% none_satisfy(S, P) = not any_satisfy(S, P)
   % @category Conversion
     to_list/1			% to_list(S) lists elements in unspecified order
]).

Is this _really_ more useful than a single alphabetic list?
(It helps when there is a consistently applied vocabulary of
categories.)

Take a look at
http://hackage.haskell.org/packages/archive/logfloat/0.12.0.1/doc/html/Data-Number-LogFloat.html
for functions in groups.

It can be a useful indexing tool WHEN YOU ARE SEARCHING THE
DOCUMENTATION.
But the order in which something is documented and the order in which it
is presented in the code need not be the same.
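
To make that concrete, here is a minimal sketch.  The module name
set_sketch and the choice of ordsets as the representation are my
assumptions for illustration only, and just a handful of the
functions listed above are included.  The export list is grouped by
category, while the definitions themselves simply run alphabetically.

```erlang
-module(set_sketch).   % hypothetical module name, not a real library
-export([
   % @category Instance creation
     empty/0,
     from_list/1,
     singleton/1,
   % @category Size
     is_empty/1,
   % @category Single element operations
     includes/2,
     sans/2,
     with/2,
   % @category Whole container comparisons
     includes_all/2,
   % @category Conversion
     to_list/1
]).

%% Definitions in plain alphabetical order, independent of the
%% category grouping used in the export list above.
%% Assumption: sets are represented as ordsets (ordered lists).

empty() -> ordsets:new().

from_list(Xs) -> ordsets:from_list(Xs).

includes(S, X) -> ordsets:is_element(X, S).

%% includes_all(S1, S2) iff every element of S2 is in S1.
includes_all(S1, S2) -> ordsets:is_subset(S2, S1).

is_empty(S) -> ordsets:size(S) =:= 0.

sans(S, X) -> ordsets:del_element(X, S).

singleton(X) -> ordsets:from_list([X]).

to_list(S) -> ordsets:to_list(S).

with(S, X) -> ordsets:add_element(X, S).
```

A reader hunting for with/2 in the source can find it by name, and a
reader of the export list still sees the categories.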

At least let the functions in each group be presented alphabetically.

> and
> order them by the order they will most probably be used.

Why do you think that is a good order?

If someone is reading for the first time, the order they need is
some sort of logical dependency order, and your "my guess about
your frequency of use" order gets in the way.

If someone is re-reading, chances are they're looking for a
particular topic, in which case the frequency-guess order at
best doesn't help.  (The thing you most need to read about
is probably the thing you use least often, because it's the
thing you reinforce your memory of least.)
>>
>> Years ago I recommended that the syntax should be extended
>> to
>> 	-behaviour(Behaviour, [Callback...]).
>> so that a cross-checking tool could tell that these functions
>> were *intended* to be used as callbacks by that behaviour and
>> weren't just accidentally adjacent.
>
> A simple -behavior(Behavior). could be interpreted by the compiler as
> exporting all of the behavior's callback functions. That would trigger
> a compiler error if you forgot to implement one.

Yes, but it wouldn't be as helpful for the human reader.
>>>
>>> %% Internal functions
>>> -export([spawnee/0, applyee/2, ...]).
>>
>> Now that we have spawn(fun () -> ... end) and F(...),
>> we shouldn't need this group at all.
>>
>
> In general you can argue that all of these internal functions can be
> handled as callbacks, and thus put into behaviors, and exported as
> such.

Possibly true, but NOT the point I was making here.
>
> Nevertheless grep for "internal exports" in the Erlang/OTP source.
> There are quite a lot of them.

Let's take pool.erl as an example.
-export([statistic_collector/0,
          do_spawn/4,
          init/1,
          handle_call/3,
          handle_cast/2,
          handle_info/2,
          terminate/2]).

Let's take the first function in that list.  It is exported
because of these two calls:

	spawn_link(pool, statistic_collector, []),
	spawn_link(Node, pool, statistic_collector, []),

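This is the old way of doing things.  These days, those calls
can be written with a local call inside a fun (a fully qualified
pool:statistic_collector() call would itself still need the export),

	spawn_link(fun () -> statistic_collector() end),
	spawn_link(Node, fun () -> statistic_collector() end),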

with no need for an export.
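
As a self-contained illustration of the same idea (this is a sketch,
not pool.erl; the module spawn_sketch and the functions start/0 and
collector/1 are made up for the example), the spawned function can
stay out of the export list entirely when it is reached through a fun:

```erlang
-module(spawn_sketch).   % hypothetical module, for illustration only
-export([start/0]).      % only the public API is exported

%% start() spawns the collector through a fun, so collector/1
%% does not need to appear in any export list.
start() ->
    Parent = self(),
    Pid = spawn_link(fun () -> collector(Parent) end),
    receive
        {Pid, ready} -> {ok, Pid}
    after 1000 ->
        {error, timeout}
    end.

%% Not exported: reachable only via the fun created in start/0.
collector(Parent) ->
    Parent ! {self(), ready},
    ok.
```

One caveat for the spawn_link(Node, Fun) form: the fun refers to
code in its defining module, so the same version of that module has
to be available on the other node.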

Now let's look at do_spawn/4.  This time there's one call.

	Pid = spawn(N, pool, do_spawn, [Gl, M, F, A]),

which could now be

	Pid = spawn(N, fun () -> do_spawn(Gl, M, F, A) end),

and then we could eliminate do_spawn/4 altogether by inlining its body,

	Pid = spawn(N, fun () ->
	    group_leader(Gl, self()),
	    apply(M, F, A)
	end),

The remaining "Internal exports", init/1, handle_call/3, handle_cast/2,
handle_info/2, and terminate/2 are precisely callbacks for gen_server,
so they fall into the class of "behaviour callbacks" for which we
agreed that a second export list was appropriate.  As behaviour
call-backs, they should not be described as "Internal exports".
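
For comparison, a minimal sketch of the layout I have in mind, with
one export list for the API and a second one labelled for what it
really is.  The module counter_sketch and its functions start_link/0
and bump/1 are invented for the example; only the gen_server callback
names come from the behaviour itself.

```erlang
-module(counter_sketch).   % hypothetical module, for illustration only
-behaviour(gen_server).

%% API
-export([start_link/0, bump/1]).

%% gen_server callbacks (not "Internal exports")
-export([init/1, handle_call/3, handle_cast/2, handle_info/2,
         terminate/2]).

%% ---- API ----

start_link() ->
    gen_server:start_link(?MODULE, 0, []).

%% bump(Pid) increments the counter and returns the new value.
bump(Pid) ->
    gen_server:call(Pid, bump).

%% ---- gen_server callbacks ----

init(N) ->
    {ok, N}.

handle_call(bump, _From, N) ->
    {reply, N + 1, N + 1}.

handle_cast(_Msg, N) ->
    {noreply, N}.

handle_info(_Info, N) ->
    {noreply, N}.

terminate(_Reason, _N) ->
    ok.
```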

Let's try another file with "Internal exports".
proc_lib.erl has one, wake_up/3.  That's used here:

hibernate(M, F, A) when is_atom(M), is_atom(F), is_list(A) ->
     erlang:hibernate(proc_lib, wake_up, [M,F,A]).

This one currently has to survive because there is no
erlang:hibernate/1, though it's not clear why.  If there were,
it could be

hibernate(Fun) ->
     erlang:hibernate(fun () -> wake_up(Fun) end).
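
Until then, the nearest one can get is a fun-taking wrapper over the
existing erlang:hibernate/3, and that wrapper still needs one
exported function to wake up in.  A rough sketch, where the module
hib_sketch and the functions start/1, hibernate/1 and wake_up/1 are
all hypothetical:

```erlang
-module(hib_sketch).    % hypothetical module, for illustration only
-export([start/1]).

%% This export has to stay: erlang:hibernate/3 takes a module,
%% a function name and an argument list, not a fun.
-export([wake_up/1]).

%% start(Fun) spawns a process that hibernates at once and
%% runs Fun() when the first message wakes it.
start(Fun) when is_function(Fun, 0) ->
    spawn(fun () -> hibernate(Fun) end).

%% Local helper: a fun-based front end to erlang:hibernate/3.
hibernate(Fun) ->
    erlang:hibernate(?MODULE, wake_up, [Fun]).

%% Entered when the hibernating process receives a message;
%% the message is still in the mailbox for Fun to receive.
wake_up(Fun) ->
    Fun().
```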

Just because there _are_ lots of "Internal exports" doesn't
mean that there NOW _should be_.

I've written up an EEP for this, but on reflection, I believe that
library changes are supposed to be posted to this list, so my next
message will be exactly that.