New trading systems platform
Richard A. O'Keefe
ok@REDACTED
Mon Jul 11 00:11:54 CEST 2005
I asked a trick question:
    > What should {1,2} + {10,20,30} do, and why?
and James Hague <james.hague@REDACTED> fell RIGHT into the trap.
    It should exit with a badarith code. Why? There's no clear meaning
    to adding vectors of different lengths. (I tried it in J and got back
    "length error."). Likewise, {1,2} + {1,two} should also result in
    badarith.
But there is another, considerably more popular, array language in which
applying a binary operation to vectors of different lengths IS defined,
and for excellent reasons:
    > c(1,2) + c(10,20,30)
    [1] 11 22 31
I am not going to say "this one is right, that one is wrong"; my point
is that there is no *OBVIOUS* answer and it takes a great deal of hard
thinking to produce a real design. It's NOT just a matter of hacking
on the VM.
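
To make the two candidate behaviours concrete, here is a minimal sketch in
today's Erlang, written as ordinary functions rather than as a change to "+".
The module name vec_add and the function names strict/2 and recycle/2 are
invented for this illustration. strict/2 follows the J-style answer (unequal
lengths are an error); recycle/2 follows the R-style answer (the shorter
operand is reused cyclically). R additionally warns when the longer length is
not a multiple of the shorter; the sketch does not model that.

```erlang
-module(vec_add).
-export([strict/2, recycle/2]).

%% J-style: the operands must have the same length, otherwise fail.
strict(A, B) when is_tuple(A), is_tuple(B),
                  tuple_size(A) =:= tuple_size(B) ->
    list_to_tuple(lists:zipwith(fun(X, Y) -> X + Y end,
                                tuple_to_list(A), tuple_to_list(B)));
strict(A, B) ->
    erlang:error(badarith, [A, B]).

%% R-style: the shorter operand is recycled to the length of the longer.
%% Treating empty tuples as an error is an arbitrary choice of this sketch.
recycle(A, B) when is_tuple(A), is_tuple(B),
                   tuple_size(A) > 0, tuple_size(B) > 0 ->
    N = max(tuple_size(A), tuple_size(B)),
    list_to_tuple([element(((I - 1) rem tuple_size(A)) + 1, A) +
                   element(((I - 1) rem tuple_size(B)) + 1, B)
                   || I <- lists:seq(1, N)]);
recycle(A, B) ->
    erlang:error(badarith, [A, B]).
```

With these definitions, vec_add:recycle({1,2}, {10,20,30}) gives {11,22,31},
matching the R session above, while vec_add:strict({1,2}, {10,20,30}) fails
with badarith. Both are defensible, which is the point.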
Oh yes, and {1,2} + {1,two} tells us that there is a question about what,
precisely, the type error should complain about. Should it complain about
(2, two) or should it complain about ({1,2}, {1,two})? You can argue this
one either way. We need a coherent principle (or small set of such
principles) which will let us decide such questions consistently.
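
The two reporting choices can be sketched as follows. This is only an
illustration under assumptions: the module name vec_err, both function names,
and the {badarith, Culprit} error shape are made up here and are not anything
Erlang does today.

```erlang
-module(vec_err).
-export([add_report_elements/2, add_report_operands/2]).

%% Choice 1: blame the pair of elements that could not be added.
add_report_elements(A, B) when tuple_size(A) =:= tuple_size(B) ->
    list_to_tuple(lists:zipwith(fun add_pair/2,
                                tuple_to_list(A), tuple_to_list(B))).

add_pair(X, Y) when is_number(X), is_number(Y) -> X + Y;
add_pair(X, Y) -> erlang:error({badarith, {X, Y}}).

%% Choice 2: check every element first and blame the whole operands.
add_report_operands(A, B) when tuple_size(A) =:= tuple_size(B) ->
    Xs = tuple_to_list(A),
    Ys = tuple_to_list(B),
    case lists:all(fun erlang:is_number/1, Xs ++ Ys) of
        true ->
            list_to_tuple(lists:zipwith(fun(X, Y) -> X + Y end, Xs, Ys));
        false ->
            erlang:error({badarith, {A, B}})
    end.
```

For {1,2} + {1,two}, the first version reports {badarith, {2, two}} and the
second reports {badarith, {{1,2}, {1,two}}}. Neither is wrong; what is missing
is a principle that says which one to expect.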
    True, I can understand that point. But at the same time, with
    test-driven development, I don't see it as any different than other
    issues caused by dynamic typing.
My point here is that it smashes a powerful new debugging tool for Erlang,
the type inference program we've been hearing about recently. Adding this
feature is NOT just a matter of hacking on the VM, it would require
serious work on the type inference program and other high-powered tools.
I didn't say that it was any different from other issues (although I would
be prepared to argue that). What I said meant that it ADDS to other issues.
There's a famous paradox in philosophy:
    one grain of sand is not a heap.
    adding a grain of sand to a bunch of sand is obviously too small
      a change to convert a non-heap into a heap.
    yet if you keep on adding grains of sand, eventually you DO have a heap.
An addition could perfectly well be similar in kind to other things in a
non-heap, and you might think that making one more change won't convert a
non-heap language into a useless-heap language, but if you KEEP making such
changes, a useless-heap is what you will get.
I still have unpleasant memories of PL/I, where you could add just about
anything to just about anything, whether it made sense or not.
Take for example (1+'2'). There obviously *is* a number inside that quoted
atom, so why *not* let it be extracted? (Would you get 3 or '3'?) If a
binary happens to be the term_to_binary() representation of a number, or
of a tuple with numbers inside it, &c, why *not* hack on the VM so that
it automatically tried binary_to_term() any time that a binary as such made
no sense?
We have to draw the line *somewhere*, and what we have now is tolerably
coherent.
    > It might, for example, be better to introduce a whole new "array" data type;
    > that would be much more work, but it could yield better performance (using
    > long-known techniques from APL) without sacrificing any of the run-time
    > type checking we now have.
    Strictly from a selfish point of view based on the kind of
    applications I work on, I'd like to see "array of float" as a
    fundamental type. Floats are individually heap allocated, so there's
    a big win to putting them in a homogeneous array (OCaml has taken a
    similar route).
It's also what Squeak Smalltalk has done, and if I've understood correctly
the GHC Haskell compiler supports this in a rather clever way. (And of
course Clean has supported it for *ages* without having to be clever about it.)
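
As a rough illustration of what a packed "array of float" buys, the idea can
be approximated today with a binary holding 64-bit IEEE floats back to back,
so the elements are not separately boxed. This is a sketch only, not a
proposal for how a built-in type would work; the module name float_vec and
its functions are invented for the example, and it relies on binary
comprehensions.

```erlang
-module(float_vec).
-export([from_list/1, at/2, to_list/1, add/2]).

%% Pack a list of numbers into one binary, 8 bytes per element.
from_list(Floats) ->
    << <<X:64/float>> || X <- Floats >>.

%% Fetch element I (0-based) without unpacking the rest.
at(I, Vec) when is_integer(I), I >= 0 ->
    Skip = I * 8,
    <<_:Skip/binary, X:64/float, _/binary>> = Vec,
    X.

%% Unpack back into an ordinary list of floats.
to_list(Vec) ->
    [X || <<X:64/float>> <= Vec].

%% Elementwise addition of two packed vectors of equal length.
add(A, B) when byte_size(A) =:= byte_size(B) ->
    << <<(X + Y):64/float>>
       || {X, Y} <- lists:zip(to_list(A), to_list(B)) >>.
```

For example, float_vec:at(1, float_vec:from_list([1.0, 3.5])) returns 3.5.
Note that add/2 here still goes through intermediate lists; a real array type
could avoid that, which is part of why it would be more than a VM hack.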
    But from a conceptual point of view,
    arrays and tuples are the same thing, so why split them up?
Because from a conceptual point of view arrays and tuples *AREN'T* the
same thing. Valid analogues are array:list and record:tuple. (As you
may have noticed, Erlang -records *are* tuples.) Sure, some tuples may
*happen* to have fields that are all the same type, but that's not usual,
just as C structs may happen to have fields all of the same type, but
usually don't.
There are oodles of array operations that make no sense at all on tuples;
it could be nice having a data type which supported those operations.