[erlang-questions] DRY principle and the syntax inconsistency in fun vs. vanilla functions

Fri May 20 09:42:04 CEST 2011

On 19/05/2011, at 2:18 AM, Michael Turner wrote:

> Another objection raised against this syntax change is that all functional languages violate DRY in this manner, so it's OK if Erlang does it too. This is a Principle of Least Surprise argument, and not bad as far as it goes. But how far does it go?
> 
> Erlang will have a better chance of greater success and survival if you consider your recruitment base to be the overwhelming majority of programmers.

I had a lengthy response to this which SquirrelMail (apparently that's the name of the
WebMail system here) managed to destroy.  Sigh.

Erlang has already survived very well for 20+ years with pretty much its present syntax.
It has also succeeded very well.  

> And from what I can tell,
> 
>   http://www.langpop.com
> 
> the overwhelming majority of programmers have no experience to speak of,

Erlang's strengths come from being DIFFERENT.

There's a thing much discussed, much derided, and much practiced in Australia
and New Zealand, called "cultural cringe".  Look it up.  Cultural cringe hurt
ISO Prolog.  (Prolog _had_ a notation for integers in bases 2..36, but the
ISO committee threw it away and copied C instead, for base 16 only.  They
also changed the precedence of at least one operator to match C better than Prolog.)
If we go by fashion, we'll change Erlang to look like PHP or COBOL.
(Estimates of COBOL code volume range from 1e11 to 5e11 lines.)

Oh, I'd like all sorts of changes to Erlang syntax.  After Haskell, I find it
quite verbose.  I once designed a "Haskerl" syntax for Erlang that reduced the
line count by a factor of 1.6.  But syntax is not the most important thing about
Erlang (or LFE would not be of interest), and it's not really what needs the
most attention.  I think the Dialyzer is easily the most dramatic and important
thing that's happened to Erlang in some time, and if we want to talk about
improvements, that might have the highest payoff.

> when it comes to functional programming languages. Appealing to the "cross-training" benefit of coming from other FP languages seems like a pretty weak argument. Especially since all I'm asking for here is syntactic consistency *within* Erlang -- a PLoS argument in itself.
> 
> Richard O'Keefe suggests that the syntactic consistency goal is better met by allowing a kind of limited-scope special-case fun name, permitting, e.g.
> 
>   Fact = fun F(0) -> 1; F(N) -> N*F(N-1) end.
> 
> I could get behind that too, but I don't follow his reasoning from syntactic consistency, which is apparently that an unnamed fun has a name, it's just the degenerate case of a name.  It's really there. We just can't see it. Hm, really? If that were true in Erlang as it stands, shouldn't I be able to write it this way?
> 
>   Fact = fun (0) -> 1; (N) -> N*''(N-1) end.

Because (a) the generalisation I suggested was
	fun <optname> <args> <etc>
	 {; <optname> <args> <etc>}...
	end
where <optname> is either a VARIABLE or omitted,
and is identical in each clause.
But '' is NOT a variable, now is it?

And (b) the generalisation I suggested was to use
the SAME <optname> in each clause, and a completely
missing name is NOT the same as an empty but present
atom.  (This is like the way that in SQL92 a NULL
string is very different from an empty string.)

And note that I did not say that this is what Erlang *DOES*
do but what it might some day be *EXTENDED* to do as a way
of letting people write recursive funs.
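
To make the motivation concrete, here is a minimal sketch (the module name
recfun and the variable names are made up for illustration).  Without a
fun-name, a fun can only call itself if it is handed to itself as an extra
argument, which is the sort of workaround an optional fun-name would make
unnecessary.

```erlang
-module(recfun).
-export([fact/1]).

%% A recursive fun written WITHOUT a fun-name: the fun receives itself
%% as its first argument (Self) and recurses through that.
fact(N) when is_integer(N), N >= 0 ->
    Fact = fun (_Self, 0) -> 1;
               (Self, K) when K > 0 -> K * Self(Self, K - 1)
           end,
    Fact(Fact, N).
```

Calling recfun:fact(5) applies the fun to itself and 5; the extra Self
parameter is pure plumbing that says nothing about the problem.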

By the way, some earlier posting from somebody said that
Erlang was alone in allowing multiple clauses in funs.
Not so.

Erlang Haskell ML
yes    yes     no     Are multiple arguments allowed in a lambda?
yes    no      yes    Are multiple clauses allowed in a lambda?

Erlang:
    fun ([_|_]) -> true
      ; ([])    -> false
    end

Standard ML:
    fn (_::_) => true
     | []     => false

While you *can* write

	val default = fn (SOME x) => (fn _ => x)
	               | NONE     => (fn d => d)

in SML, it is normal practice to write

	fun default (SOME x) _ = x
          | default NONE     d = d

and I think most SML programmers would regard the first alternative
as perversely unreadable.

> 
> 
> What Richard's suggesting appears to require a rigorous re-think of how scopes are defined in Erlang.

Not in the least.  The scope of a function name variable would be the function itself,
just like any other variable in the arguments of the clauses.

1> X = 1.
1
2> F = fun (X) -> ok ; (_) -> uh_oh end.
#Fun<erl_eval.6.13229925>
3> F(2).
ok

See how the X in the fun was *not* the X outside?
Same rule exactly for fun-names.
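
The same point as a compiled module, as a sketch only (module and function
names are hypothetical).  A variable in a fun head is always a fresh
variable, whereas a variable that is already bound and appears in a case
pattern is matched against.  The compiler may warn about the shadowed
variable and the unreachable second fun clause, but it should still accept
the code.

```erlang
-module(scope_demo).
-export([in_fun/1, in_case/1]).

%% X in the fun head is a NEW variable that shadows the outer X,
%% so the first clause matches any argument.
in_fun(V) ->
    X = 1,
    F = fun (X) -> ok; (_) -> uh_oh end,
    F(V).

%% In a case pattern an already-bound X is NOT fresh:
%% the first clause matches only when V equals 1.
in_case(V) ->
    X = 1,
    case V of
        X -> ok;
        _ -> uh_oh
    end.
```

A fun-name would follow the in_fun rule: visible inside the fun, invisible
outside it.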

> What I'm suggesting amounts to simply asking the compiler to do some of your tedious keying for you.
> 
> -michael turner

(A) If repeating the function name is the most tedious keying you have,
    how fortunate you are!
(B) People with a decent text-editor don't have a problem anyway.
    One reason for putting case, if, fun, and receive semicolons at
    the beginning of the line is so that when you end a line with a
    semicolon, the editor can automatically
	- go back to the first preceding line to have a letter or '
	  in column 1;
	- copy the function name into a work space
	- go back where it started
	- insert the work space.
    I did this years ago for Prolog.  I found it distracting, so
    switched the feature off, but anyone who really really hated
    retyping function names should be able to program this in a
    few minutes.
(C) The names are not there for the benefit of the compiler, but
    for the benefit of PEOPLE.  When you are reading someone else's
    code, it HELPS to see the function name in each clause.
    And that's why I'm willing to give head-room to fun-names.
    I really loathe seeing a bare argument list without any indication
    of what it's the argument list *of*.
>  
> 
> On Wed, May 18, 2011 at 6:16 PM, Michael Turner <michael.eugene.turner@REDACTED> wrote:
> I can say
> 
>    fun (1)->2;
>         (2)->1
>    end
> 
> but, oddly, I can't define a named function in the analogous way, e.g.:
> 
>    factorial
>      (1) -> 1;
>      (N) -> N*factorial(N-1).
> 
> gives me a syntax error. I find the latter more readable than
> 
>    factorial(1) -> 1;
>    factorial(N) -> N*factorial(N-1).

I do not.  In fact I find the first version of factorial rather horrible.
If you really want to do that, you can write

    factorial(N) -> case N of
      (1) -> 1;
      (N) -> N*factorial(N-1) end.

right now.  Just don't ask me to read it for anything less than NZD 400/hour.
> 
> It's also less to type and to read, which is consistent with the DRY principle ("Don't Repeat Yourself").

The DRY principle was not handed down on Mt Sinai.
It is a rule of thumb, no more and no less.
And indeed, it violates something I have found to be an excellent
guide, that a *controlled* use of redundancy is an aid to correctness.

For example, -spec is redundant, but
- it lets Erlang detect a difference between what you DID write and
  what you MEANT to write
- someone can read the -spec without having to read the code.
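
A small sketch of that kind of controlled redundancy (the module name
spec_demo is made up for illustration).  The type specification repeats
what the clauses already imply, yet a reader can learn the intended domain
and range from one line, and a tool such as the Dialyzer has something to
compare the clauses against.

```erlang
-module(spec_demo).
-export([factorial/1]).

%% The spec states the intent: non-negative integer in, positive integer out.
-spec factorial(non_neg_integer()) -> pos_integer().
factorial(0) -> 1;
factorial(N) when is_integer(N), N > 0 -> N * factorial(N - 1).
```

If a clause were later edited to return, say, an atom, the code and the
spec would disagree, and that disagreement is exactly what can be checked.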

There are other examples of good redundancy in Erlang.

The keyword here, of course, is *controlled* redundancy; just enough to
do something useful (like helping human beings read your code, or
permitting some check), not too much.

> It also looks a *little* bit more like the mathematical convention for defining these sorts of functions, where you have "f(x) = ", centered vertically to the left of a big left "{" that (left-)encloses the list of expression/parameter-condition pairs in a two-column format, e.g.,

That is one mathematical convention, followed in Miranda, and leading to
guards in Haskell and Erlang.  It is regrettable that SML does not have
them, but at least there's the 'nowhere' preprocessor to provide them.

There are other mathematical conventions.
"The Fibonacci numbers are defined using the linear recurrence relation

    F_n = F_{n-1} + F_{n-2}

 with seed values:

    F_0 = 0
    F_1 = 1

" (Wikipedia entry on recurrence relations).

This is the style that Haskell, Clean, Mercury, SML, CAML, and so on follow.
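
For comparison, here is a sketch of both conventions in today's Erlang (the
module and function names are hypothetical).  The first function follows
the recurrence style, one equation per clause with the name repeated; the
second imitates the "f(x) = {" layout, with a single head and a column of
conditions.

```erlang
-module(fib_demo).
-export([fib/1, fib_braced/1]).

%% Recurrence style: seed values and the recurrence as separate clauses.
fib(0) -> 0;
fib(1) -> 1;
fib(N) when is_integer(N), N > 1 -> fib(N - 1) + fib(N - 2).

%% "Big left brace" style: one head, then condition/expression pairs.
fib_braced(N) when is_integer(N), N >= 0 ->
    if N =:= 0 -> 0;
       N =:= 1 -> 1;
       N > 1   -> fib_braced(N - 1) + fib_braced(N - 2)
    end.
```

Both compute the same values; which one reads better is the matter under
dispute.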

> It seems to me that, if anything, this requires only a *simplification* of the Erlang parser. That leaves only one obvious objection: would any existing code break if Erlang syntax were altered to allow this?

Probably not.  However, the parser is a tiny part of the Erlang system, even a
tiny part of the compiler.  Simplicity of the parser per se is hardly worth
worrying about.  What matters is having a readable language.

Even if the change were made, every Erlang book and paper ever printed shows
the existing style.  How would newbies react to code that looked like nothing
they had ever been taught about?  What would happen to Erlang language-sensitive
editing tools?  Would they cope with the new syntax?  I know that the crude
pretty-printer I wrote for Erlang would have trouble, and the commands I now
find useful for moving to the next/previous clause would cease to work.