user-defined operators

Tue Mar 30 08:17:28 CEST 2004

"Martin J. Logan" <mlogan@REDACTED> is still arguing,
though I'm not sure why.

	The definition of operator overloading is:  "Use of a single
	symbol to represent operators with different argument types", i.e.
	one operator is overloaded to handle two strings or two integers
	irrespective of where the operator is defined or how many times
	it is defined separately.

That's not "THE" definition.  It may be YOUR definition.

By the definition I use, a symbol which is bound to a single definition
is not overloaded, no matter how many types that definition applies to.

Let me quote a famous compiler textbook (exercise for the reader:
which book?):

    p330:	An "overloaded" identifier can have a set of possible
		types.  [They really mean "signatures", and really.]
    p334:	A symbol that can represent different OPERATIONS in
		different contexts is said to be "overloaded".
		[This is the "multiple definitions in scope at once"
		 definition I used, and is different from the one on
		 p330].  Overloading may be accompanied by coercion
		 of types.
    p344:	A DISTINCT NOTION from overloading is that of
		"polymorphism".

It invites confusion to talk about types in this context, because
Erlang doesn't have them.  A function like

    f(X, Y) when is_integer(X), is_integer(Y) -> X + Y;
    f(X, Y) when is_list(X), is_list(Y) -> X ++ Y.

is a *single* definition which reacts differently to different
run-time values, just like

    f(X, Y) when X >= 0, Y >= 0 -> X + Y;
    f(X, Y) when X < 0, Y < 0 -> X * Y.

is a single definition which reacts differently to different
run-time values.  In neither case is f a particularly easy function
to understand, but in neither case can you blame this on overloading,
because neither of these f/2 definitions is overloaded.

Note also that the fact that f is confusing is due solely to the
function it computes; it is NOT due to the fact that it's an operator.
(Because at the moment, it ISN'T an operator.)

Do I really have to stress this elementary point again?

 An operator which is confusing because it is bound to a function
 that has a confusing definition is not confusing because it is an
 operator but because the function that defines it is confusing;
 using it as an operator does not introduce ANY kind of polymorphism
 that is not already present or ANY kind of overloading that is not
 already present.

Adding user-defined operators to Erlang WOULD NOT AND COULD NOT
introduce any kind of polymorphism or overloading that Erlang doesn't
already have.  If you don't like those kinds of polymorphism or
overloading or witchcraft or whatever you call it, blame EXISTING
Erlang for that, not operators.

	It follows that if you have polymorphic functions and you base
	your user defined operators on those functions then you can do
	the same thing with operators as you can with functions, namely
	overload them to handle different "types", i.e. perform a
	different operation with two strings vs two integers etc...

Aside from the nonstandard use of terms here, yes.  However, the
important thing is that whether a name is used with backquotes or
without, it still must resolve to a single definition, and that
definition will *BE* a perfectly ordinary Erlang function.  If you
want to call the function bad names, go ahead, but that is no argument
against operators.  It might perhaps be an argument against a
dynamically typed language, but then, if you want Mercury, you know
where to find it.
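
To make the point concrete, here is a minimal sketch.  The module name
opdemo and the function demo/0 are made up for illustration, and the
backquoted form appears only in a comment, because it is the proposed
syntax, not something today's compiler accepts.  What the sketch does
show is that Erlang's existing operators are already reachable as
ordinary functions in the erlang module, and that f/2 is one definition
however it is called.

```erlang
-module(opdemo).
-export([f/2, demo/0]).

%% One definition of f/2 with two clauses.  Which clause runs depends
%% on the run-time values of the arguments, not on any overloading.
f(X, Y) when is_integer(X), is_integer(Y) -> X + Y;
f(X, Y) when is_list(X), is_list(Y) -> X ++ Y.

demo() ->
    %% Built-in operators are ordinary functions in module erlang.
    3 = erlang:'+'(1, 2),
    "ab" = erlang:'++'("a", "b"),
    %% Assumption: under the proposal, X `f` Y would mean exactly
    %% f(X, Y), so these calls are all there is to understand.
    3 = f(1, 2),
    "ab" = f("a", "b"),
    ok.
```

Calling opdemo:demo() should return ok; every match in it is against
the same single definition of f/2.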

	It is understood that user defined operators are orthogonal to
	operator overloading in the strict sense.

Right.  So why all this nonsense about overloaded operators?

	Again though if you allow
	user defined operators to be based on polymorphic functions then
	you ARE allowing overloaded operators.

No.  Not at all.  I repeat, "A symbol that can represent DIFFERENT
OPERATIONS" (famous book, page 334) is one that is overloaded.  A
symbol that can only represent one operation is not overloaded.

Nobody goes around slagging off Erlang for having "overloaded functions",
but it is the *functions* that are the issue, not the operators.

	In my years of programming I have come to expect that one must
	read a function before understanding what it does in many cases.

Right.  With you 90%.  AND IT DOES NOT MATTER WHETHER THAT FUNCTION
IS CALLED USING OPERATOR SYNTAX OR NOT, you have to read the
function's documentation (ideally, NOT the function itself) to find
out what it is supposed to do.

	I typically do not research what the * operator does really
	thoroughly though.

Don't you?  Heck, it's the second thing I do when learning a new
language.  Step 1:  what are the basic data types.  Step 2:  what
are the built in operations.  I must have spent hours studying the
Fortran standard's built-in operators, ditto Erlang's.

	I simply don't want to see `*` in code that
	concatenates strings or multiplies two floats producing a
	rounded integer.

IT CAN'T HAPPEN.  The proposal we're discussing DOES NOT ALLOW
new definitions for existing operators.  Since there already *is*
a '*' operator, you cannot define a new one.  (In fact, with the
proposal as I have it, you couldn't do this for another reason:
`*` would not be a legal operator name in any case.)

	Erlang is readable and effective, adding user
	defined operators does little to increase the expressive power
	of the language and goes one step in decreasing its simplicity.

Having used several languages with user-defined operators, I have to
disagree strongly.  User-defined operators (especially alphabetic
ones) can dramatically improve the readability of code.  They have
absolutely no effect on the *semantic* simplicity of a language,
and only trivial effect on the *syntactic* simplicity (one extra
grammar rule).

I *also* admit that user-defined operators can be used to excess,
but what can't?

	Many people like Joe's !! operator because it is more in keeping
	with the theme and syntax of Erlang than a function call to do
	the same.

Well, Joe is the principal inventor of the language.  If anyone has
a coherent view of the essence of Erlang, he does.  Mortal that I am,
I dare to disagree.

I view !! as a wart.

Not because it is a new operator.  I _like_ new operators.

Not because it's not in keeping with the syntax of Erlang;
it is, although it is just as BIG a change to the syntax of
Erlang as adding user-defined operators would be, for much
less gain.

Not because it's overloaded, although for someone who attacks
user-defined operators in general on the (groundless) basis of
"overloading", it's rather inconsistent to praise "!!".  To
quote the "conceptual integrity" slides (slide 3), in A!!B
    A can be a Pid,
    or a string,
    or a list of Pids and strings.
and that is precisely the kind of overloading that Martin J. Logan
is so hot against.  It's quite a nasty kind, because strings *are*
lists.  (The slides hint that a string is a file name; the examples
suggest that it may be any URI.)  In fact there's a seriously weird
kind of URI, "erl://"++Node++"/"++Name !! X means to do Name!!X on Node.
We're told that "erl://www.sics.se/home/joe/foo" !! read should have the
same effect as "/home/joe/foo" !! read on www.sics.se; but the text at
the top of slide 7 suggests that it should have the same effect as
"home/joe/foo" !! read (note the absence of the initial slash).
As for using "/dev/spawn" !! Fun, this is overloading with a vengeance.
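
Here is a rough sketch of why the "strings *are* lists" point bites.
It is NOT Joe's implementation; bangdemo and classify/1 are names I
have invented, and using io_lib:printable_list/1 to guess whether a
list is a string is my assumption about how one might try to tell the
three documented cases apart.

```erlang
-module(bangdemo).
-export([classify/1]).

%% Guess which of the three documented cases the left operand is.
%% A string is just a list of character codes, so the only way to
%% separate "string" from "list of Pids and strings" is to inspect
%% the elements at run time.
classify(A) when is_pid(A) -> pid;
classify(A) when is_list(A) ->
    case io_lib:printable_list(A) of
        true  -> string;           % every element is a printable character
        false -> list_of_targets   % assumed to be Pids and strings
    end.
```

Note that "" and [] are the same term, so classify/1 cannot tell an
empty string from an empty list of targets; it answers string for both.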

I seem to be a bit inconsistent myself.  Not really.  I *DON'T* say that
!! is overloaded.  I *do* say that it does radically different things
depending on the run-time type of its argument AND on the exact run-time
value of its argument ("/dev/spawn" !! X and "/dev/stdin" !! X do not
seem to have any meaningful X in common).  This means that if I see
(Provider !! Message) then I have no idea what it is about until I have
traced in some detail what the possible values of Provider and Message
might be.  If I can't tell whether it is spawning a new process,
sending a message to an existing process, doing an FTP access, deleting
a file, or firing an intercontinental missile without detailed data
flow analysis, what are the odds that any plausible tool will help me?

(Since the HTTP protocol does have PUT (RFC 2068 section 9.6), I don't
know why slide 9 says '"http://..." read only files'.  Nor is it clear
to me how "/dev/videoplayer" is supposed to be hooked up to showmetv.)

	They do not advocate it just because it is an operator.

Nobody ever said they did.

The question was WHAT IS THE SMALLEST CHANGE TO ERLANG THAT WOULD
ALLOW PEOPLE TO EXPERIMENT WITH NEW OPERATORS WITHOUT HAVING TO HACK
THE COMPILER SOURCES.

For what it's worth, I don't feel queasy about !! because it is an
operator, but because the definition basically boils down to
"something does something somewhere with something", which isn't
very helpful.

To get an idea of just *how* unpredictable "!!" can be, look at
slide 11.  There we see that something that _looks_ like a file name
can actually be registered as doing something else, so just because
on one rare occasion I can see
	"/dev/null" !! {write,X}
I *still* have no idea what it does, because "/dev/null" could be
the registered name of a process that interprets {write,X} as
"send X as spam to 240 million people".

In short, the claim on slide 12 "easier to understand" does not seem
to me to be justified.  Not because of what people *might* do with !!,
but because of what the designer *has* done with !! in those very slides.

	If you want to restrict !! to `rpc` then all you are
	really arguing for is to add a strangely restrictive infix
	erlang function calling syntax.  I can't figure out how that
	benefits anyone.

Then I guess you haven't been paying attention.  First off, there is
nothing "strangely restrictive" here.  Erlang operators all have one
or two arguments.  Therefore if a function is to be called using operator
syntax, it must have one or two arguments.  (Someone else had the idea
of allowing prefix unary user-defined operators.  It makes sense; it's
easy to handle; let's do it.)  That is, it is not in the least
restrictive for functions where you would *want* to use it.

Most important, the argument is that user-defined operators have
PROVEN THEMSELVES IN PRACTICE in several languages.  Erlang is almost
alone amongst functional languages in not allowing them.  They really
do provide a welcome increase in readability *some of the time*.

	We are trained through the mathematics we took
	in school to use non-alphanumeric symbols as operators and that
	is just what we shall do when we get to make up our own.

I'm sorry that you think Erlang is so bad.  Look at the following
list of operators, culled from the 5.3 reference manual:

    Familiar from school:
	= < > + -
    Not familiar from school:
	* (should be x) / (should be -:-)
	== /= =< >= =:= =/= ++ -- : # !
    Not familiar from school and alphabetic:
	bnot div rem band bor bxor bsl bsr
	not and or xor orelse andalso catch
    Totals: 5 + 13 + 15 = 33

If familiarity from school were a tolerable criterion for what should
be allowed as an operator, only 5 of Erlang's 33 operators would be
allowed.

If "non-alphanumeric symbols" were a tolerable criterion for what
should be allowed as an operator, nearly half (15) of Erlang's
existing operators would be forbidden.
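
For anyone who wants to check the tallies, here is a small sketch that
counts the three groups as listed above.  The module name opcount and
its function names are invented for this illustration; the operator
lists are copied from the table, written as quoted atoms because most
of them are reserved words or symbols.

```erlang
-module(opcount).
-export([counts/0]).

%% Familiar from school.
school() -> ['=', '<', '>', '+', '-'].

%% Symbolic, but not familiar from school.
symbolic() ->
    ['*', '/', '==', '/=', '=<', '>=', '=:=', '=/=',
     '++', '--', ':', '#', '!'].

%% Alphabetic operators.
alphabetic() ->
    ['bnot', 'div', 'rem', 'band', 'bor', 'bxor', 'bsl', 'bsr',
     'not', 'and', 'or', 'xor', 'orelse', 'andalso', 'catch'].

%% Returns {FamiliarFromSchool, OtherSymbolic, Alphabetic, Total}.
counts() ->
    S = length(school()),
    Y = length(symbolic()),
    A = length(alphabetic()),
    {S, Y, A, S + Y + A}.
```

opcount:counts() gives {5, 13, 15, 33}, which is where the 5 of 33 and
the 15 of 33 come from.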

In short, this diktat is definitely NOT "in keeping with the theme
and syntax of Erlang".  The Erlang designers rejected it entirely,
as do I.

	If I want to deal with refactoring people's code that looks like
	a bunch of cryptic symbols between strings and numbers where the
	symbols do different things depending, then ... well I don't
	want to deal with code like that so I will put my vote in for
	only having to deal with cryptic functions from time to time and
	for keeping things just the way they are.

I can only characterise this as scaremongering.
The proposal I'm arguing for *FORBIDS* "cryptic symbols" as operator
names.   And operators would not and could not "do different things
depending" to any greater extent than Erlang functions already can
and do.  Allowing alphameric function names to be used as operators
would in no way make Erlang harder to read or maintain, and it
really is *not* good enough to make wild statements saying that they
could.

I repeat:

> I'm tired of scaremongering about how bad things *could* get, when
> Haskell and Fortran experience shows that they *don't* get bad, and
> when they *can't* get bad in some of the alleged ways.