Enhanced type guard syntax

Thu Sep 18 22:23:45 CEST 2003

Valentin wrote:
> For example, when we say ONE, we can write it as "1" or
> "one" -- it wouldn't change the meaning, right? 

<aside note>
Er... I wouldn't consider that Good Style (an interface shouldn't accept 
inputs that are so ill-defined as natural language), but of course there 
are similar and worse interfaces in practice. (Heck, I have been forced 
to do some interfaces of this kind myself.)
For example, if a library has lived long enough, it will accept inputs 
of "modern" and of "ancient" style, which tend to be quite different 
data structures.
</aside note>

I think two things are getting conflated here: guards and preconditions.

A guard is a kind of selector: the first implementation that has guards 
that match a given set of parameters is chosen. (In OO, all guards are 
on "typeof (first argument) <= typeof (first parameter)", so, in a 
sense, Erlang is indeed OO, but not on the message level *grin*.)
If the guards of a given implementation don't match, the system will 
try the next one; it will not fail.
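
To make the selector behaviour concrete, here is a minimal sketch in 
plain Erlang. The module and function names (guard_demo, describe/1) 
are made up for illustration; only the "when" guards themselves are 
standard.

```erlang
-module(guard_demo).
-export([describe/1]).

%% Clauses are tried top to bottom. A guard that does not hold simply
%% moves on to the next clause; nothing fails at that point.
describe(X) when is_integer(X), X >= 0 -> non_negative_integer;
describe(X) when is_integer(X)         -> negative_integer;
describe(X) when is_float(X)           -> float;
describe(_)                            -> something_else.
```

Calling guard_demo:describe(-3) skips the first clause and lands in the 
second. Only if no clause at all matched would the call end in a 
function_clause error.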

A precondition is unrelated to implementation. If the precondition 
fails, the caller has done something evil. Examples are calling sqrt 
with a negative argument or, say, a bank transfer with identical source 
and destination accounts. Any routine added later whose guard matches 
such a call is simply wrong: it violates the contract that's supposed 
to go with a function of that description.
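
For contrast with a guard, a precondition check might look roughly like 
this. It is only a sketch: the wrapper checked_sqrt/1 and the error 
term {precondition_violated, ...} are my own invention, not an existing 
convention.

```erlang
-module(pre_demo).
-export([checked_sqrt/1]).

%% Hypothetical wrapper: a violated precondition is reported as an
%% error in the caller; no other clause gets a chance to "rescue" it.
%% is_number/1 is tested first because in Erlang's term order an atom
%% compares greater than any number, so "foo >= 0" would be true.
checked_sqrt(X) ->
    case is_number(X) andalso X >= 0 of
        true  -> math:sqrt(X);
        false -> erlang:error({precondition_violated, {sqrt, X}})
    end.
```

The point of the design is that there is exactly one clause: a bad 
argument is a bug in the caller, not an invitation to try another 
implementation.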

Preconditions are valuable, both as a kind of built-in unit test and as 
a documentation aid.
For testing, have the system compile the stuff with precondition 
checking enabled. For extra convenience, have the system compile them in 
anyway, and have them activated and deactivated by the run-time system. 
For generous convenience, have them activated and deactivated on a 
per-module or even per-function basis :-) (oh, and activate and 
deactivate them in the *calling* places, you usually want to debug the 
caller if you're checking precondition violations *g*)
For documentation, it's simply both more precise and concise to have
   sqrt (X >= 0.0)
instead of the verbose and natural-language all-too-often-fuzzy
   sqrt (X)
   %% X must be >= 0.0
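
As for switching the checks on and off at run time, one possible 
sketch follows. Everything in it is an assumption of mine: the module 
pre_switch, the key {pre_check, Module}, and the choice of 
persistent_term (which needs OTP 21.2 or later) as the place to keep 
the switch. It also keys the switch on the module that owns the 
function, not on the calling place, which would need more machinery.

```erlang
-module(pre_switch).
-export([enable/1, disable/1, checked_sqrt/1]).

%% Hypothetical per-module switch stored in persistent_term.
enable(Mod)  -> persistent_term:put({pre_check, Mod}, true).
disable(Mod) -> persistent_term:put({pre_check, Mod}, false).

%% The condition is passed as a fun so that it is not even evaluated
%% while checking is switched off (off is the default).
check(Mod, CondFun, Info) ->
    case persistent_term:get({pre_check, Mod}, false) of
        true ->
            case CondFun() of
                true  -> ok;
                false -> erlang:error({precondition_violated, Info})
            end;
        false ->
            ok
    end.

checked_sqrt(X) ->
    check(?MODULE, fun() -> is_number(X) andalso X >= 0 end, {sqrt, X}),
    math:sqrt(X).
```

With checking disabled, checked_sqrt(-1.0) fails inside math:sqrt/1 
with badarith; after pre_switch:enable(pre_switch) the same call fails 
with the more telling precondition_violated error.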

Syntax ideas (just thinking out loud):

Ideally, one would use something like
   function (X/guard)
where "guard" is a fun() that accepts a single parameter, and returns 
True or False.
For example, this would allow one to write stuff like
   is_real (X) and X >= 0.0
Unfortunately, taking out the X and getting this into a fun() is going 
to be *very* ugly. Too much syntactic cruft around here.
Haskell solves the issue far more elegantly IMHO: any "incomplete" 
expression (which is an expression missing its last parameter) is 
automatically a fun(). The above as a fun() would then look like this:
   function (X / is_real and ('=<' 0))
where the constituents are:
'=<' is the ordinary =< operator, just as a function of two parameters 
(i.e. you'd use this as '=<' 0 X to mean "0 =< X"); this is just 
syntactic sugar to convert the =< symbol from an operator into a 
function name.
In the above, only the first parameter is given, yielding the function 
('=<' 0), which is the function that returns True if its only parameter 
is greater than or equal to zero.
is_real is just the name of an ordinary one-parameter function.
"and" is actually and/3, not the ordinary and/2. It takes two 
single-parameter functions and a value; when called, it feeds the value 
to both functions and returns the logical AND of the two results. 
(I'm not sure whether the compiler can be made to recognize that the 
"and" in the "is_real and ('>=' 0)" bit actually is and/3. Maybe some 
syntactic sugar is required. Well, I'm just thinking out loud anyway, 
without any claims to relevance *g*.)
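
Spelled out with ordinary funs, the combination described above could 
be sketched as follows. The names both/3 and at_least/1 are 
hypothetical ("and" is a reserved word in Erlang, so it cannot serve as 
a function name), and is_float/1 stands in for is_real.

```erlang
-module(pred_demo).
-export([both/3, at_least/1, non_negative_float/1]).

%% both/3 plays the role of the "and/3" described above: it feeds X
%% to two one-parameter predicates and ANDs the results.
both(F, G, X) -> F(X) andalso G(X).

%% Hand-written partial application: at_least(0) is the predicate
%% that holds for values greater than or equal to zero.
at_least(Min) -> fun(X) -> Min =< X end.

non_negative_float(X) ->
    both(fun erlang:is_float/1, at_least(0), X).
```

This works, but it shows the cruft I was complaining about: every 
partially applied operator needs its own fun or helper.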

Hmm... no, it won't do. Erlang identifies functions by name and 
parameter count. If I write "foo X", the compiler expects it to be a 
call to foo/1, not a scantily clad foo/2 missing its second parameter. 
We'd need something like an extra "dummy" parameter, something like
"foo X _".
In the above, this would be
   function (X / is_real (_) and ('=<' 0 _))
or even
   function (X / is_real (_) and 0 =< _)
which isn't much of an advantage over
   function (X / is_real (X) and 0 =< X)
but remember that parameter names can be *much* longer than a single 
character :-)
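
What such a "_" placeholder would presumably stand for is a single fun 
whose one argument is substituted for every "_". A sketch of that 
expansion, written by hand (the module name is made up, and is_float/1 
again stands in for is_real):

```erlang
-module(placeholder_demo).
-export([guard_fun/0]).

%% Hand-expanded form of the placeholder notation: each "_" becomes
%% the same variable V of one enclosing fun.
guard_fun() ->
    fun(V) -> is_float(V) andalso 0 =< V end.
```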

Anyway, on to another aspect: guards that involve multiple parameters.
For a contrived example, assume
   distribute_percentages (X, Y, Z)
   %% X, Y and Z must sum to 100.
I'd prefer to write this as
   distribute_percentages (
     X / in_range (0, 100, _),
     Y / in_range (0, 100, _),
     Z / in_range (0, 100, _) and X + Y + Z == 100.0)
(See how the formalization forced me to become quite explicit about what 
a "percentage" is?)
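
For comparison, the nearest thing that can be written with existing 
guard syntax is sketched below. The IN_RANGE macro, the module name and 
the return value are assumptions for the sake of the example. Note that 
this is still a guard, not a precondition: with a single clause a 
violation merely shows up as a function_clause error.

```erlang
-module(percent_demo).
-export([distribute_percentages/3]).

%% Hypothetical helper macro; andalso is allowed inside guards.
-define(IN_RANGE(Lo, Hi, V),
        (is_number(V) andalso (Lo) =< (V) andalso (V) =< (Hi))).

%% The last test spans all three parameters. Beware that with floats
%% the sum may miss 100 by a rounding error.
distribute_percentages(X, Y, Z)
  when ?IN_RANGE(0, 100, X),
       ?IN_RANGE(0, 100, Y),
       ?IN_RANGE(0, 100, Z),
       X + Y + Z == 100 ->
    {X, Y, Z}.
```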

Just my 2c.

Regards,
Jo