Packages in Erlang: new documentation

Thu Sep 4 07:54:20 CEST 2003

	On Wed, 3 Sep 2003, WILLIAMS Dominic wrote:

	> While I am at it, and before the Erlang community commits to the
	> concept of packages, I would like to argue in favour of Eiffel's
	> concept of a posteriori name conflict resolution by renaming.

I take it that he is referring to something like

    -name_module(snark, remote.module.name.called.boojum).

after which the module can refer to snark:start() and start a boojum.

	Yes, but it does not work in Erlang, because there you _cannot_
	rename a module, since you cannot know if someone is going to
	call it via apply(M, F, [...]) or spawn(M, F, [...]). It is not
	possible to know in general what the M:s are, or even to which
	M the caller is actually referring (the first M, or the later added
	one that was renamed?)

Clearly I must have misunderstood *someone*, because to make what I
thought was proposed work, all you have to do is have a local table
of renamings in the module, and then have apply/3 and spawn/3 and so
on look in that table.
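
To make that concrete, here is a minimal sketch of such a table, under
stated assumptions.  The module name rename_sketch and the functions
resolve/1, apply_renamed/3 and spawn_renamed/3 are hypothetical names
invented for illustration; nothing like them is in the proposal.  The
standard module 'lists' stands in for the long-named boojum module so
that the sketch can actually be run.  A real design would have the
compiler or runtime consult the table inside apply/3 and spawn/3
themselves rather than go through wrappers.

```erlang
-module(rename_sketch).
-export([resolve/1, apply_renamed/3, spawn_renamed/3]).

%% Hypothetical local renaming table: local alias -> real module.
%% 'lists' is only a stand-in for remote.module.name.called.boojum.
resolve(snark) -> lists;
resolve(Module) -> Module.          % names without an alias pass through

%% Wrappers that look in the table before dispatching.
apply_renamed(M, F, Args) -> apply(resolve(M), F, Args).
spawn_renamed(M, F, Args) -> spawn(resolve(M), F, Args).
```

With this in place, rename_sketch:apply_renamed(snark, reverse, [[1,2,3]])
calls lists:reverse/1, while a module name with no entry in the table is
used unchanged.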

I note that Haskell has moved from a flat module namespace to a
hierarchical module namespace comparatively recently.
Ada 95 introduced nested packages to Ada, with some interesting and
complex visibility rules.

I note that Common Lisp moved in the other direction:  the original package
design had nested packages, but the one adopted for the standard didn't.
Similarly, the original book on SETL had a complex nested package system,
but that was later dropped.

From a survey of languages, one finds that dotted namespaces are a
popular idea but seem to be surprisingly complex in practice; frankly
they are something that should NEVER be released until there has been
quite a lot of experimentation (and ideally a formal specification).
One of the best things David Warren ever did for Quintus was to keep
on saying NO to the module system designs that others (principally
David Bowen) came up with until finally there was one that was simple
enough.

There are a number of things about that web page that bother me.

The first is that while the noun "package" is used often, there
doesn't seem to be any referent for it.  There are module objects,
and there are name objects (the "case X of foo.bar.baz ->" example
suggests that dotted names are just symbols).  But there doesn't
seem to be any *Erlang* object for a "package" to refer to.  Packages
appear to have no properties and enter into no relationships; there
are really only package *names*.

In short, from the inside, there appears to be no significant difference
between a module name like foo_bar_baz and a module name like foo.bar.baz.

That's the first thing that bothers me:  language that confuses me by
talking about packages when there are only dotted module names.

The second is that there is a fixed relationship between a module
name and a set of places to look for it.  This has proven to be a real
pain in Java, where you want a development version of a package and
a stable version, but because they are versions of the same package,
they get the same mapping onto directories.

Amongst other extremely nasty consequences, as far as Erlang is
concerned, 'forx' and 'forX' are two different module names,
but on some file systems (Windows) they must map to the same file
name.  There's another nasty consequence.  Consider the following
example:

    1> c('*foo*').
    {ok,'*foo*'}
    2> l('*foo*').
    {module,'*foo*'}
    3> '*foo*':f().

where '*foo*.erl' is

    -module('*foo*').
    -export([f/0]).
    f() -> 27.

Erlang is one language, with one set of rules about what can be a symbol.
ANY symbol can in principle be used as a module name.  But there are
several file systems:  UNIX has one set of rules for what can be used
in a file name ('/' may not, '\0' may not, but any other character may
be), Windows has another set of rules, RiscOS has yet another set of
rules, other operating systems have yet other rules.  Since the set of
operating systems is open-ended, if you insist on mapping Erlang module
names directly to file names, NOBODY CAN EVER KNOW WHAT THE SET OF
PORTABLE ERLANG MODULE NAMES IS!  Nobody can even know how long an
Erlang module name may be:  is the limit 8 or 10 or 27 or 251 or what?

The only programming languages I've seen where this problem is
satisfactorily addressed are Common Lisp (the "defsystem" facility,
although it didn't make it into the standard) and Eiffel (the "LACE"
facility, which I _hope_ will make it into the ECMA standard for Eiffel).

Let me put this as bluntly as I possibly can:
    While there may be a default mapping from Erlang module names
    to files, it must be possible for someone installing a module
    or package of modules to put each file exactly where s/he wants
    without _any_ constraint on how files are named.
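
As a rough illustration of what I mean (a sketch under assumptions, not
a proposal for the actual mechanism), the mapping could be an explicit
catalogue supplied by the installer.  The module name locate_sketch,
the functions path_for/2 and load/2, and the {Module, Path} list format
are all hypothetical.  The only existing library calls used are
lists:keyfind/3, file:read_file/1 and code:load_binary/3.

```erlang
-module(locate_sketch).
-export([path_for/2, load/2]).

%% Catalogue is a list of {Module, BeamPath} pairs written by whoever
%% installs the code.  The file name need not resemble the module name.
path_for(Module, Catalogue) ->
    case lists:keyfind(Module, 1, Catalogue) of
        {Module, Path} -> {ok, Path};
        false -> error
    end.

%% Load a module from wherever the catalogue says its object code lives.
load(Module, Catalogue) ->
    case path_for(Module, Catalogue) of
        {ok, Path} ->
            case file:read_file(Path) of
                {ok, Bin} -> code:load_binary(Module, Path, Bin);
                {error, Reason} -> {error, Reason}
            end;
        error ->
            {error, not_in_catalogue}
    end.
```

With a catalogue such as [{'*foo*', "/opt/site/obj/m0001.beam"}] (a
made-up path), the module '*foo*' can live in a file whose name every
file system accepts, and 'forx' and 'forX' can be given distinct files
even where file names are case-insensitive.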

Amongst other things, as soon as you step outside ASCII, it's entirely
possible that the file system used by the author of a module will encode
characters one way (Latin-1, say) while the file system used by the
installer may encode characters another way (UTF-8, say).

There are a couple of other things that bother me, but those are the
big ones.  I've always regarded the mapping from module names to file
names as a short-term makeshift, and with the introduction of dotted
names the idea of a fixed mapping from module names to file names is
now very definitely an idea whose time has *GONE*.  (The possible
loopholes caused by a search path make me queasy.  That's an idea whose
time has gone too.)