[erlang-questions] Bug ?!

Thu Oct 5 02:08:22 CEST 2006

Richard Carlsson <richardc@REDACTED> wrote:
	I agree that [(n+k) patterns] are very cute. Probably too cute.
	For a detailed discussion by the Haskell nomenklatura:
	   http://www.cs.chalmers.se/~rjmh/Haskell/Messages/Decision.cgi?id=203

If you do see that, note in particular the first two paragraphs:
    "we have an extra reason not to change the (n+k) syntax"
    "n+k patterns will stay"
    "they should either be in or out and they're in!na

If we look at Colin Runciman's arguments, for example:
(0) + isn't a constructor -- this is not just an argument against n+k,
    it is an argument against abstract patterns or views of any kind.
    (n+k isn't quite definable in my abstract patterns proposal, but n+1 is.)
    This could be fixed by using some symbol other than +; if the
    syntax were (for argument's sake) 1'(x) or 2'(x) instead of
    x+1 or x+2, we could also allow 1'(x) or 2'(x) as a construction
    form in expressions (with the condition that x >= 0).

(1) the test n>=0 is a hidden side condition -- the side condition is
    precisely why we WANT this pattern.
    I believe this could be fixed by the same approach as (0):
    if there were a quasi-constructor k' for each positive integer
    k, which raised an exception of its argument were not a natural
    number, then the "side-condition" could no longer be honestly
    called "hidden", because it would apply always and everywhere.

(2) it is forbidden in a top-level pattern -- well, it shouldn't be.
    This argument doesn't apply at all to Erlang, where function definitions
    are not top-level patterns; there aren't any of those.

(3) other patterns can be made more specific by putting a constant
    for a variable but while n+2 is legal, 2+2 isn't -- fix that by
    making 2+2 legal.  He also argues that n+2 is legal but n+m is not;
    if we used k' as a quasi-constructor, where the ' is *part of the token*,
    that problem also would disappear.  It would be no stranger that
    2'(n) cannot be generalised to m'(n) than that F2(n) cannot be
    generalised to (F m)(n).

(4) n+k looks like an aid to equational reasoning but isn't -- true,
    but it is hard to take that argument seriously:  does anyone expect
    it to be?  This argument would hold no force whatever if the ' syntax
    were used.
        "Suppose f is defined by the sole equation f 1'(n) = 0.
	 Is this a law: for all integer expressions n, f '(n) = 0?"
    Not quite, but it IS a law for all expressions n with natural number	 
    values.  The range condition here is no more surprising or exceptional
    than "if f is defined by f n = 0" it is NOT a law that
    "for all integer expressions n, f n = 0", not in a strict language like
    Haskell it isn't, because the expression might fail to have a value.

(5) "Superficially, it encourages careful case analysis, but again it
    undermines it" -- turns out to be an attack on n+k syntax FOR DOING
    ITS JOB.  The job of n+k syntax is to *NOT* handle negative numbers.

(6) Why not c*n+k patterns, where 0 <= k < c -- but that is precisely
    what I asked for in my previous message.

(7) Uses of n+k in existing code are easily detected, and easily
    translated into (n+k)-free equivalents -- half true.  Yes, n+k
    is easy to detect.  No, a compiler can "easily" translate it away,
    but a human cannot "easily" transform all occurrences of n+k into
    *READABLE* equivalents.  That's why I want N+k in Erlang, so that
    code can be written in a safer (in cases where you would have forgotten
    the range check) and more readable (in cases where you would not) way.
    This is an argument that n+k *can* be eliminated, not that it *should*.

(8) Omitting n+k would simplify the language -- and so would eliminating
    if-then-else, short-circuit and, short-circuit or, 'where', and a host
    of other bits of syntactic sugar.  Simplicity is not an absolute goal.
    (There are some truly bizarre things in GHC, some of them seriously
    considered for Haskell-prime.)

(9) Retaining n+k would be controversial -- so would omitting it.
    So this doesn't really count as an argument against it.

Those are reasonably typical arguments, and Richard Bird did a superb
job of demolishing them.  For example,
    "(0-3) The anomaly remarks abotu + not being a constructor are all,
     of course, true.  But in 14 years of teaching, I have never found
     that they cause students any confusion."

David Lester's argument that the use of n+k patterns limits your code
to integral types, when some more general arithmetic type might have
been usable otherwise, is true enough for Haskell, but totally irrelevant
to Erlang.  Even for Haskell, the argument is bogus, because n+k COULD
be made to work for any instance of Num.  Here's how:

    define (n+k) to match v when
      signum (v - fromInteger k) /= signum (fromInteger (-1) `asTypeOf` v)
      n matches v - fromInteger k

Let's suppose we want to answer arguments (0), (1), (2 does not apply 
in Erlang), (3), arguably (4), (5), part of (6) that I omitted, but
in a design for Erlang.

    For any positive integer literal k,
	#+k is a postfix unary operator
	#-k is a postfix unary operator
    these operators are "quasi-constructors".
    That is, they act like constructors, and may be used in expressions
    and in patterns, but unlike list and tuple constructors (but like
    binary constructors) their arguments are restricted.

    In an expression, the (left) argument
    - of #+k must evaluate to a non-negative integer x,
      the result is x+k;
    - of #-k must evaluate to a non-positive integer x,
      the result is x-k.

    In a pattern, the (left) argument
    - of #+k must be
	a non-negative integer literal x, matching (x+k), or
	a variable bound to a non-negative integer x, matching (x+k), or
	an unbound variable or wild-card X, in which case the pattern
        matches any integer y greater than or equal to k and binds
        X to y-k;
    - of #-k must be
       a non-positive integer literal x, matching (x-k), or
       a variable bound ro a non-negative integer x, matching (x-k), or
       an unbound variable or wild-card X, in which case the pattern
       matches any integer y less than or equal to -k and binds
       X to y+k.

    X#+k matches e#+k if and only if integer(e), e >= 0, X = e
    X#-k matches e#-k if and only if integer(e), e =< 0, X = e

Now we can do a three-way analysis:

    factorial(N = M#+1) -> N * factorial(M);
    factorial(0) -> 1;
    factorial(_#-1) -> exit({badarith,[{?MODULE,factorial,1}]}).

	> But also, when working with binary, ternary, and quaternary trees,
	> I have wanted to do this:
	> 
	>     f(1) -> ...;
	>     f(2*K+0) -> ...;
	>     f(2*K+1) -> ....

	It is really very pretty. One almost wants to pick it up and cuddle it. 
	But it's probably the kind of feature that is catering too much for a 
	special case while at the same time not really being useful all that 
	often in real code.

Oddly enough "we don't provide this obviously useful generalisation of n+k"
is one of the standard arguments brought out by people who oppose n+k.
There seem to be quite a few people who regard this as a natural extension
of pattern matching, even if they see it as undesirable on deeper reflection.

Let me put it this way:  Lisp has been around for decades without any kind
of pattern matching at all.  Get used to the Lisp style, and you find
yourself thinking "all this pattern matching is catering too much for a
few special cases while at the same time not really being useful all that
often in real code, what with sequence functions and CLOS and all sorts of
things you can't write patterns for".  If we _had_ c*n+k patterns we might
well find them useful a lot in real code.  Maybe not.  I *do* know that I
use n+k patterns a lot in Haskell and wish for them a lot in Erlang, 

In the same way, the more I look at
    let P = E0 then E1, ...
    for Q <- ...
    in Result
the more I find uses for it and the more I find that it would simplify and
clarify existing code.  (I managed to shrink one chunk of someone else's
Erlang code by nearly a factor of three this way.)  What you find useful
depends on what you have available to use.