Structs - thoughts on implementation

Mon Jan 20 00:51:20 CET 2003

Hi Robert

> Lawrie Brown wrote:
> >On Tue, Jan 14, 2003 at 10:41:44AM +0100, Joe Armstrong wrote:
> >and a statement like: E = ~C{hates="skiing"} results in:
> >    E = {guru, dynamic_struct, hates, "skiing", likes, "water rockets", 
> >    name, "robert"}
> Actually I like skiing. :-)

... Sorry - first thing off the top of my head ;-)

> I wonder if this will be fast enough. If we are really serious about 
> structs then we should see them as a replacement for records sometime in
> the future, and this would be logN time with at least one atom comparison. 
> When we introduced records users were very worried that they would be 
> slower than using tuples straight off. It took some convincing.

I'm not disagreeing with any of the above, but frankly I can't think of any
way to allow dynamically varying structures at run-time without having some
tradeoff like the above.

However for my 'static_struct' variant, if you used a struct meta-command
as someone (sorry forgot who) suggested, something perhaps like (haven't
thought this one out fully yet):

    -struct(guru, static_struct, {hates, likes, name}).

you could then have the compiler optimise to use fixed offsets into the
tuple for any constant field references. You could also extend the
above syntax in some manner to provide for default field values for
those not explicitly specified in the assignment.
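
As a rough sketch of what the compiler might generate for the declaration
above (nothing here is settled): I'm assuming a static struct is laid out
as {Name, static_struct, Value1, Value2, ...} with the values in declaration
order, and that unspecified fields default to 'undefined'. The module name
and the functions new/1, get/2 and set/3 are made up for illustration.

```erlang
-module(static_struct_sketch).
-export([new/1, get/2, set/3]).

%% Hypothetical defaults; the -struct syntax for them is not yet defined.
defaults() -> [{hates, undefined}, {likes, undefined}, {name, undefined}].

%% Fixed offsets a compiler could derive from
%% -struct(guru, static_struct, {hates, likes, name}).
%% Positions 1 and 2 hold the name tag and the struct kind (assumed layout).
offset(hates) -> 3;
offset(likes) -> 4;
offset(name)  -> 5.

%% Build a struct from {Field, Value} pairs; missing fields get defaults.
new(Fields) ->
    Values = [proplists:get_value(F, Fields, D) || {F, D} <- defaults()],
    list_to_tuple([guru, static_struct | Values]).

%% Constant field references become plain element/setelement calls.
get(Field, {guru, static_struct, _, _, _} = S) ->
    element(offset(Field), S).

set(Field, Value, {guru, static_struct, _, _, _} = S) ->
    setelement(offset(Field), S, Value).
```

With this, static_struct_sketch:get(name, S) costs one tuple access and no
search, which is the point of the static variant.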

I also liked the idea of some type of struct import at runtime (perhaps
in the user attribute table or similar for a module), but you'd then
still have to have some sort of runtime initialisation stage to cache
the offsets used. My initial thoughts on how: you'd have the compiler
build a table that, for each struct used in the module, lists all field
names used along with the offset to use when known (say 0 by default),
and the initial field value if defined. Then at runtime, when the module
is loaded, you'd also have to load any module it imports struct info
from, and have the code check whether the struct is static and, if so,
fill in the offsets for each of the field names used in this module.
The defaults get filled in regardless.

At this point you can also check the field names used against the
import if it's static, and throw an error if they don't match (that is,
the fields used for any given struct in a module must be a subset of
all fields defined by the import, with the rest taking their default
values).
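
The load-time step described above could look something like this. It is
only a sketch under my own assumptions: the compiler-built table is a list
of {Field, Default} pairs, the imported definition is either the atom
dynamic_struct or {static_struct, FieldNames}, field values start at tuple
position 3, and the names struct_load_sketch and resolve/2 are invented.

```erlang
-module(struct_load_sketch).
-export([resolve/2]).

%% Used:       [{Field, Default}] for the fields this module references.
%% Definition: dynamic_struct | {static_struct, [Field]} from the import.
%% Returns     [{Field, Offset, Default}], where offset 0 means
%%             "not known, search at run time".
resolve(Used, dynamic_struct) ->
    [{F, 0, D} || {F, D} <- Used];
resolve(Used, {static_struct, Defined}) ->
    %% The fields used must be a subset of the fields defined.
    case [F || {F, _} <- Used, not lists:member(F, Defined)] of
        [] ->
            [{F, offset(F, Defined), D} || {F, D} <- Used];
        Unknown ->
            erlang:error({unknown_struct_fields, Unknown})
    end.

%% Positions 1 and 2 are assumed to hold the name tag and struct kind,
%% so the first declared field lives at position 3.
offset(Field, Defined) -> offset(Field, Defined, 3).

offset(Field, [Field | _], N) -> N;
offset(Field, [_ | Rest], N)  -> offset(Field, Rest, N + 1).
```

A module using only 'likes' from the guru struct would end up with
[{likes, 4, Default}] after loading, or with offset 0 if guru turned out
to be dynamic.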

Any code in the module that needs to access a struct field would then
just access the relevant entry in this table (directly if it's a
constant ref, e.g. ~C{likes="motorbikes"}) and either use the saved
offset, or realise it must do a search.

If it's a dynamic struct, then you're forced to do a runtime search
for the field.
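
For the dynamic case, the search over the tagged layout shown earlier
({guru, dynamic_struct, hates, "skiing", ...}) might be sketched as below.
This is a plain linear scan to keep it short; keeping the field names
sorted would allow the logN binary search Robert mentions. The module name
and get/3 are hypothetical, not proposed BIFs.

```erlang
-module(dynamic_struct_sketch).
-export([get/3]).

%% Look up Field in a dynamic struct of the given Type.
%% Returns {ok, Value}, or the atom error if the field is absent.
get(Struct, Type, Field)
  when is_tuple(Struct), tuple_size(Struct) >= 2,
       element(1, Struct) =:= Type,
       element(2, Struct) =:= dynamic_struct ->
    find(Field, Struct, 3).

%% Field names sit at odd positions from 3 on, each followed by its value.
find(Field, Struct, I) when I < tuple_size(Struct) ->
    case element(I, Struct) of
        Field -> {ok, element(I + 1, Struct)};
        _     -> find(Field, Struct, I + 2)
    end;
find(_Field, _Struct, _I) ->
    error.
```

Calling get/3 with the wrong Type, or on something that is not a dynamic
struct, fails with function_clause rather than returning a value.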

> How would you handle anonymous structs?

The same as I think Joe suggested in one of his emails - just use a
reserved name tag eg anonymous or perhaps [] (which was I think Joe's
suggestion), which would be treated specially, and explicitly flag that
no match on the record name was required. And you might want alternate
forms of the BIFs below (or whatever they evolve into) that don't
specify the structure name explicitly, which then implies anonymous
(and indeed would then be essentially identical to the existing
tuple BIFs just with atom instead of numeric field identifiers).

> >In terms of run-time support, I lean towards extending/overloading some of
> >the tuple BIFs thus:
...
> > + element(Var, Type, Tag) - checks Var is a struct of specified Type,
...
> > + setelement(Var, Type, Tag, Value) - checks Var is a struct of specified
...
> > + size(Var, Type) - checks Var is a struct of specified Type, and returns
...
> > + struct(Var, Type) - a recognizer for a struct of specified Type (cf 
... 
> It would be a relatively easy way of properly introducing structs into 
> the language, not just as a pass to the compiler. I am a bit wary of 
> overloading BIFs too much. I know I am guilty of doing it in the past 
> but it was probably wrong in many cases.

Hmmm - I must admit that within reason, I actually rather like overloading.
When the meaning is essentially similar it saves inventing multiple
nearly identical names just to satisfy a fetish that every name must be
distinct.

And in the specific cases here, to me the intent of the BIFs is
essentially the same, e.g. get/set a field in a "collection" data type
(viz. tuple or struct) etc. Overloading feels rather natural and eases
the learning load.

Cheers
Lawrie

------------------------------------ <*> ------------------------------------
Post: Dr Lawrie Brown, Computer Science, UNSW@REDACTED, Canberra 2600 Australia
Phone: 02 6268 8816    Fax: 02 6268 8581    Web: http://www.adfa.edu.au/~lpb/