Structs (was RE: Record selectors)

Wed Jan 15 10:39:07 CET 2003

> Joe Armstrong <joe@REDACTED> wrote:
> 
> > [...]
> >   With structs you can write things like:
> > 
> > 	A = ~{name="joe", footSize=42},   %% define a struct
> > 	A.footSize,                       %% access a field
> > 	B = ~A{likes="motorbikes"},       %% add a new field
> > 	~{likes=X, name=N} = B            %% pattern match etc.
> > 
> >   *without* any record defs.
> 
> Hm, OK, I've read the paper, let me see if I've got this straight.
> 
> Structs are basically dictionaries a la the dict module, except:
> 
> 1) you can pattern-match on their contents
> 2) they have their own special syntactic sugar
> 3) they're implemented directly in the VM for performance
> 
> Well, #1 is a great advantage over dicts, of course.
> 
> #2 is a mixed blessing - A.footSize is easier to read and to type out than
> dict:fetch(A, footSize), but it's also rather unorthogonal, symbolspace
> is getting more and more crowded, "~" is easy to confuse with "-" in many
> fonts, plus I personally think tildes are kind of ugly :)
> 

  The problem is more due to the dot than the tilde :-)

  . means 

	- end of a function or attribute
	- the separator in a floating point number
	- separator in a structured module name
	- record field separator

  That's why you need a tilde or hash to help the parser (and the human):

	~ means here comes a struct
	# means here comes a record 
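
  A minimal sketch of three of those dots in ordinary record-based
Erlang (the module name `dots` and the record `person` are made up for
illustration; the structured module name case is left out):

```erlang
-module(dots).                          % '.' ends an attribute
-export([demo/0]).

-record(person, {name, foot_size}).

demo() ->
    P = #person{name = "joe", foot_size = 42},
    Size = P#person.foot_size,          % '#' announces the record, '.' picks the field
    Half = Size * 0.5,                  % '.' inside a floating point number
    {Size, Half}.                       % '.' ends the function
```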

> #3 is arbitrary (there's no reason dicts or any other data type couldn't
> be implemented in the VM too.)
> 

  It's not really *arbitrary* - this  is one of the most tricky design
decisions there is.

  The  choices of  what you  do  in compact  syntax and  what you  do
efficiently in the VM are crucial.

  The "send" operation is written

	A ! B

  And compiles down to a single opcode in the VM - also a great deal
of effort  has gone into  the implementation of  send. This is  not an
arbitrary decision but something that is at the very heart of language
design.

  If the user had to write

	send_message(A, B)

  and if the *implementation* of send was slow - then Erlang would be
useless as a concurrent programming language.

  Language design *is* (in a sense) choosing which operations should be
expressed by succinct syntax and efficiently implemented.
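
  To show how little syntax a round trip needs, here is a small sketch
(the module `send_demo` and function `echo_once/1` are hypothetical
names, not anything from the paper):

```erlang
-module(send_demo).
-export([echo_once/1]).

%% Spawn a process, send it a message with '!', and wait for the reply.
echo_once(Msg) ->
    Parent = self(),
    Pid = spawn(fun() ->
                    receive
                        {From, M} -> From ! {self(), M}   % reply to the sender
                    end
                end),
    Pid ! {Parent, Msg},                                  % the send operator
    receive
        {Pid, Reply} -> Reply
    after 1000 ->
        timeout                                           % no reply within a second
    end.
```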

> Also, since it's one of the considerations that sparked this thread, it's
> worth noting that structs are in no way safer than records; in fact,
> they're so flexible that they're arguably less safe.  Records have
> rigid structure, while structs, like dicts, are just random bags of data.

  Yes   -  but  structs   do  fit   nicely  with   the  rest   of  the
language. Records do  not have the Erlang "feel" -  they are too rigid
for my liking - and the mess with include files is very non-Erlang.

  You will introduce new errors by typos in struct member names, but
this is no worse than mis-spelling an atom in any other context, and
while mis-spelt atom names *are* a problem they are not a *big* problem.

  I routinely run  all my code through "coverage" -  and this picks up
virtually  all errors  due to  mis-spelt atom  names. <aside>everybody
should do this</aside>
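
  One way to do this is with the `cover` module from the tools
application. The sketch below is an assumption about how such a check
might be wired up, not a description of my own setup; `cover_demo` and
`uncovered_lines/2` are made-up names. A clause that matches a
mis-spelt atom never runs, so its line shows up as uncovered.

```erlang
-module(cover_demo).
-export([uncovered_lines/2]).

%% Cover-compile the source file File, run Fun as the test driver,
%% and return the line numbers that were never executed.
uncovered_lines(File, Fun) ->
    cover:start(),                                   % fine if already started
    {ok, Mod} = cover:compile(File),
    Fun(),
    {ok, Lines} = cover:analyse(Mod, coverage, line),
    [Line || {{_Mod, Line}, {0, _NotCovered}} <- Lines].
```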

> So, I don't want to sound too negative, but honestly, I'm less than
> thrilled by the idea.
> 
> But, that's possibly because the world I live and program in puts an
> emphasis on validation.  I still think you can get more "bang for your
> buck" with "objects" - if you write a module for each data type you use,
> you have full control over the interface, you can screen out bad data
> before it gets into the aggregate, you can use whatever storage scheme you
> like (and change it at a later date,) you have a convenient place to group
> all the applicable functions, etc etc, at a base cost surely not *too*
> much greater than using structs...

  My view is that data validation *only* occurs at a human -> computer
interface and at a boundary where two components communicate via a
protocol and where the components are not trusted.  For this something
like my UBF system http://www.sics.se/~joe/ubf seems to be a step in
the right direction. UBF + timing + (other non-functional stuff) +
invariants would seem to be the right way to go.

  Basically you validate data (as hard as you can) when it enters the
system.  Thereafter, internally, there should be no validation -
just let things crash and design the error recovery through the
application of carefully chosen invariants.
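
  As a rough sketch of that split (the module `boundary`, the
`{person, Name, FootSize}` tuple and both function names are invented
for illustration): check once at the edge, then just pattern match
inside and let a bad term crash the process.

```erlang
-module(boundary).
-export([accept/1, foot_size/1]).

%% Edge of the system: check untrusted input as hard as we can.
accept({person, Name, FootSize})
  when is_list(Name), is_integer(FootSize), FootSize > 0 ->
    {ok, {person, Name, FootSize}};
accept(Other) ->
    {error, {bad_input, Other}}.

%% Inside the system: no checks. A malformed term fails the match
%% and the calling process crashes with function_clause.
foot_size({person, _Name, FootSize}) ->
    FootSize.
```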

> 
> Forgive me, but is Erlang really so object-shy that encapsulation is
> something that we feel we can afford to avoid simply because "they" have
> turned it into a meaningless buzzword?
> 

  No, encapsulation is fine - but objects (as in OO languages) do a lot
more than just encapsulate things.

/Joe

> -Chris
>