[erlang-questions] dialyzer output help

Mon Aug 10 17:32:02 CEST 2015

I wanted to comment here on specific points, so I've taken your email 
apart and reassembled it for what I felt fit a better narrative for my 
stuff. Please don't see some dishonesty in the argument there (creating 
a strawman or whatever) as this isn't my intent :)

On 08/10, Garrett Smith wrote:
>Back to comments - my point is that your *code* is the docs, as far as
>implementation. If your code is not clear, make it clear. If you can't
>make it clear, try harder. There are some very rare cases where some
>code is just hard and some comments can help clarify what's going on -
>but these are *unicorn rare* in my experience.
>

I conducted a poll a few years ago (2012) on Erlang and Maintenance 
in this community (http://ferd.ca/poll-results-erlang-maintenance.html) 
and one of the interesting results was in the following image (also 
attached):

http://ferd.ca/static/img/charts/correlations_comments-erlang.png

The belief that comments are required tends to decrease with 
experience in the language and familiarity with it. At the very least, 
there was a decent correlation.

I like to think of it as someone learning to read: you first start by 
reading out loud as you read; it helps build connections and solidify 
your grasp on language. As you become a more proficient reader, you 
subvocalize more and more up to the point where speed readers do not 
even 'mentally read' words -- they just glance over the text and can 
understand meaning from it anyway.

Newcomers writing comments in the code, as far as I can rationalize it, 
tends to be about writing down a translation of what the code should be 
doing against the actual implementation.

As you get more familiar, that translation takes place automatically and 
the comments become more useless.

If I had to give a tip to newcomers, it would be that it's fine to write 
these comments first -- they act as a bit of a specification or pseudo 
code. As you go on, the refactoring Garrett mentions would be to try, as 
much as possible, to make the logic in your code look like the comments 
you wrote, so that they become obsolete and can be dropped.

Let's take some fake code as an example:

    %% Verify that only adults (18 years or older) can enter
    can_enter(Age) when Age >= 18 -> true;
    can_enter(_) -> false.

So that's a decent specification, and a newcomer gets to go "oh okay, 
Age is the variable, >= is how they compare here, not =>, and apparently 
the clauses are alternatives".

As you get more used to it, you lose the interest for the comment, but 
there's still a lot of interesting stuff left untold:

- The function assumes that age is the accepted factor for adulthood
- The function assumes that the program runs in a jurisdiction where 
  adulthood is defined as 18, rather than 16, 19, or 21, for example
- The function only expects numbers to be passed in.

So a more complete form could be imagined to be written like:

    %% Age of adulthood in Quebec
    -define(ADULT_AGE, 18).
    -type age() :: number().

    %% Age verification is required by the government
    %% on online betting websites; credit cards would work
    %% but can sometimes be granted to minors and isn't sufficient
    %% for proper verification. However, just asking for age
    %% can legally be considered good enough.
    -spec can_enter(age()) -> boolean().
    can_enter(Age) when Age >= ?ADULT_AGE -> true;
    can_enter(_) -> false.

And now the code has more criteria encoded. You'll notice I pulled a 
bunch of information out of my rear end just to add comments, but 
really, those are the tricky things in code that bite us and are worth 
putting in there, on top of everything else.

Why? Because the next time someone comes in and refactors, they know why 
the code was added, and it makes it so, so much easier to make a good 
decision on whether it should be removed or not without having to hope 
it's somewhere in source control.
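
One of the assumptions listed earlier, "only numbers are passed in", is 
stated by the spec but not enforced by the guard. Erlang's term order 
sorts every atom above every number, so with the guard as written 
can_enter(unknown) would return true. Here is a minimal sketch of one 
way to close that hole. The module name (entry) is made up for the 
example, and answering false for non-numbers is my assumption about 
what would be wanted, not something the example above specifies:

```erlang
-module(entry).
-export([can_enter/1]).

%% Age of adulthood in Quebec
-define(ADULT_AGE, 18).
-type age() :: number().

%% Gotcha: in Erlang's term order any atom compares greater than
%% any number, so without the is_number/1 check a call such as
%% can_enter(unknown) would return true.
%% Assumption for this sketch: non-numbers are refused, not crashed on.
-spec can_enter(age()) -> boolean().
can_enter(Age) when is_number(Age), Age >= ?ADULT_AGE -> true;
can_enter(_) -> false.
```

The comment here earns its place because the gotcha is invisible in the 
code itself: the guard looks redundant until you know about term order.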

What should be kept in comments? The stuff that is hard to represent in 
code:

- Why you picked a given algorithm or implementation details
- Gotchas on how to use the code
- Edge conditions that cannot be represented otherwise (through type 
  specifications and whatnot) in your language of choice
- The rationale for broader design decisions. "We picked UUIDs for
  identifiers because the system requires sharding and it is easier
  with them than sequential ids." Hell: should the ids be considered
  opaque or not?
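
The last item can be sketched as follows: the rationale goes in a 
comment, and the "opaque or not" answer goes in the type system through 
-opaque, so Dialyzer can warn callers who peek inside. Everything named 
here (the ids module, new/0, to_binary/1) is hypothetical, and the 
random bytes are only a stand-in for whatever UUID library a real 
project would use:

```erlang
-module(ids).
-export([new/0, to_binary/1]).
-export_type([id/0]).

%% Rationale: we picked UUID-style random identifiers because the
%% system requires sharding, and that is easier with them than with
%% sequential ids.
%% Callers must treat id() as opaque: do not parse it or rely on
%% its ordering.
-opaque id() :: binary().

%% Returns a fresh 16-byte random identifier.
%% Stand-in for a real UUID library; this is NOT an RFC 4122 UUID.
-spec new() -> id().
new() ->
    crypto:strong_rand_bytes(16).

%% The one supported way to get a serializable form of an id.
-spec to_binary(id()) -> binary().
to_binary(Id) ->
    Id.
```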

Details about what the code *does* are interesting to newcomers, still, 
so part of the question you have to ask yourself is: who is my target 
audience? Who will read this code?

Even if the need is not there for veteran Erlangers, it might be there 
for newcomers. If you're the senior member (or consultant) on a team 
with a lot of new people, it might be interesting to put these 'useless' 
comments in to onboard everyone else.

Development is extremely human, and knowing that such comments are 
important to some people means that they *should* be in your toolbox if 
you work with them, or want them to work with you.

>Documenting user facing functions, on the other hand, is important to
>help your users understand how to use a function. It doesn't
>necessarily cover each and every implementation detail - for that you
>can always read the code - but it should provide what a user needs to
>use the function as it was intended to be used.
>

Yes! But "read the code" is a pisspoor thing to tell almost *anyone*.  
The reason is that there are many types of documentation for different 
audiences:

Newcomers: what does it do, why did you write it, who should or 
shouldn't use this piece of code, where can additional info be found, 
short examples, etc.

Regulars: Reference manuals, examples, edoc, wikis, websites, API 
descriptions

Contributors: architecture, project structure, principles guiding the 
design ("we favor the users on errors", "the state on disk cannot be 
left inconsistent and we prefer to lose data", etc.), tests: where to 
find them and how to run them, issue tracker, **the source**.

If anyone but a contributor (or someone looking for contributor-level 
information) has to read the source, your doc is not good enough.

Regulars and newcomers shouldn't have to care about your code and how it 
does things; they care about whether they should use it, what it does, 
and how to use things.

Only contributors or people debugging your stuff tend to care about
*how* it does things or *why* it does it.
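
As a rough sketch of that split, the edoc comment below speaks to 
newcomers and regulars (what it does, when to use it, a short example), 
while the plain comment inside the function body is aimed at 
contributors. The module name (entry_docs) is made up, and the legal 
rationale is the same invented one as in the earlier example:

```erlang
-module(entry_docs).
-export([can_enter/1]).

-define(ADULT_AGE, 18).

%% @doc Tells whether a visitor of the given age may enter the site.
%%
%% Call this before showing any betting page. Returns `true' for
%% visitors aged 18 or more and `false' for anyone else, including
%% values that are not numbers.
%%
%% Example: `entry_docs:can_enter(21)' returns `true'.
-spec can_enter(number()) -> boolean().
can_enter(Age) ->
    %% Contributor note (the *why*): 18 is the age of adulthood in
    %% Quebec, and asking for age is assumed to be legally sufficient.
    %% The is_number/1 check is needed because atoms compare greater
    %% than numbers in Erlang's term order.
    is_number(Age) andalso Age >= ?ADULT_AGE.
```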

I did expand a bunch on this in a blog post before, so I'll just link to 
it rather than filling everyone's mailboxes even more: 
http://ferd.ca/don-t-be-a-jerk-write-documentation.html

Regards,
Fred.