Record selectors

Tue Jan 7 14:51:27 CET 2003

On Sun, 5 Jan 2003 11:43:37 +0100
"Daniel Dudley" <daniel.dudley@REDACTED> wrote:

> This thread has generated some interesting replies, but I'm
> left wondering whether I have been entirely understood.
> Here's a sample test module, which I hope will help clarify
> my original message:
> [...]
> This example illustrates that not only is the reference to
> a specific record-type in the record selector syntax
> superfluous, it is downright "dangerous" because it invites
> error-prone code.

Yep, it *is* dangerous, in the same way as omitting bounds-checking on
fixed-length arrays is dangerous.

It is not, however, superfluous from the compiler's point of view.  The
compiler has to know which record you mean in order to resolve the field
name to a fixed tuple position at compile time and so generate efficient
code.  If it weren't told which record you mean, it would have to leave
that decision to the runtime, which would result in less efficient code.
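
To make the compile-time resolution concrete, here is a minimal sketch.  The
module name and the two function names are made up for illustration; the
country record mirrors the example used later in this message.

```erlang
-module(record_pos_demo).
-export([name_index/0, name_of/1]).

%% Represented as the tuple {country, Code, Name}.
-record(country, {code, name}).

%% #country.name is a compile-time constant: the tuple position of
%% the name field (the tag is at position 1, code at 2, name at 3).
name_index() -> #country.name.

%% The selector names the record, so the compiler already knows
%% which position to read.
name_of(C) -> C#country.name.
```

Without the record name in the selector, the position could not be known
until the value had been inspected at runtime.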

In essence, when you use records in Erlang, you're trading safety for
performance (just as when you omit bounds-checking in some other
languages).  If this is a conscious, intentional tradeoff, there's nothing
really wrong with it, assuming the programmers who are using your code
can be trusted to have a certain minimum level of discipline.

Another way to put this might be that, while Erlang is a dynamically-typed
language, it currently uses a record mechanism taken straight from the
pages of a textbook on statically-typed languages :)  So, it's awkward. 
It is, however, a slightly nicer syntax than saying:

  -define(COUNTRY_NAME_POS, 3).
  ...
  MyCountryName = element(?COUNTRY_NAME_POS, MyCountry).

> The bottom line is that a record selector is tied to the
> type of value stored in the variable being investigated. If
> this type is (as expected) a tuple, then the actual record
> type is (must be) determined by reading the first element
> of the tuple. Only then can one decide which of the
> remaining elements in the tuple corresponds to the
> specified field.

With regard to safety, your point is *very* valid.

The only thing I can add to it is that IMO it would be better if a new
mechanism providing that safety were added to the language, rather than
changing the existing implementation of records and possibly breaking
programs which have made performance assumptions about them.

Actually, thinking about it a bit, a nifty trick could be to have the
first element, instead of being an atom, be a fun that maps field names to
positions.  Like:

  CountryTag = fun
    (code) -> 2;
    (name) -> 3
  end,
  Country = {CountryTag, 47, "Norway"}.

This way, 'X#Y' would compile to

  element((element(1, X))(Y), X)
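
As a rough, runnable sketch of that idea (the module and the functions
new_country/2 and field/2 are hypothetical helpers, not anything that
exists in the language or its libraries):

```erlang
-module(fun_tag_demo).
-export([new_country/2, field/2]).

%% Hypothetical constructor: the first element is a fun that maps
%% field names to tuple positions, instead of the usual atom tag.
new_country(Code, Name) ->
    Tag = fun(code) -> 2;
             (name) -> 3
          end,
    {Tag, Code, Name}.

%% Hypothetical accessor: roughly what 'X#Y' would expand to.
%% The position is looked up at runtime via the fun in element 1.
field(Field, X) ->
    Tag = element(1, X),
    element(Tag(Field), X).
```

Here field(name, new_country(47, "Norway")) looks up position 3 through the
tag and returns "Norway".  Asking for a field the fun has no clause for
fails with a function_clause error rather than reading the wrong element.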

While this provides a simple and inexpensive (I think) way for the runtime
to decide which field goes where, it still doesn't address the more
important issue that 'country' and 'tuple' are overlapping types.  It
would be much better if country could be a real, unique, opaque type, that
couldn't be confused with any other type.  It's all well and good to say
that the representation of some data type (like those exported by the dict
module etc) is undefined, but if that means that one could unknowingly
write a pattern that accidentally matches it, it's still not very safe :)
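
A small sketch of the overlap problem, using a made-up module and function
names: a clause written for arbitrary three-element tuples also matches a
country record, because the record is nothing more than such a tuple.

```erlang
-module(overlap_demo).
-export([norway/0, classify/1]).

-record(country, {code, name}).

%% Builds a country record, i.e. the tuple {country, 47, "Norway"}.
norway() -> #country{code = 47, name = "Norway"}.

%% Intended for generic three-element tuples, but it cannot tell
%% them apart from a country record.
classify({_, _, _}) -> triple;
classify(_) -> other.
```

Calling classify(norway()) returns triple, even though the author of
classify/1 never had records in mind.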

-Chris