Packages in Erlang: new documentation

Fri Sep 5 03:05:39 CEST 2003

I pointed out that apply/3 and spawn/3 and suchlike could be made
to work in the presence of module renaming.

This is a good time to point out that the longer I ponder this
(and I have been thinking about Erlang modules for about 7 years now),
the clearer it seems to me that

(1) Module names must be completely decoupled from file names.
    [This does not mean that you cannot have a _default_ mapping
    from module names to file names, only that you cannot have a
    _fixed_ mapping.]

(2) It is a bad idea to introduce names that don't actually name
    anything.

    Erlang modules are somewhat anomalous in the language.  There really
    are module "objects", with many properties, but there are not module
    "values".  If you don't believe that modules are objects in Erlang,
    take a look at the 'c' and 'code' modules, and you'll see what I mean.
    While you cannot ask a module *itself* for its properties, you can
    ask the Erlang system for the properties of a module.

    It is possible to have a system of hierarchical modules.  That's
    how Pop-2 does it, and that's how Ada does it, and that's how Mercury
    does it.  In those languages, modules may contain types, functions,
    _and other modules_.

    In contrast, this package proposal introduces a class of names
    that do not name anything.  Where are the operations to
        - list all loaded packages
        - list all available packages
        - list the contents of a package
        - load a package
        - delete a package
        - purge a package
        - ensure that a package is replicated on another node
    and so on?

    These operations may exist, but I didn't see them in the document
    we were asked to look at.  If they do exist, I would be interested
    in reading a fuller version of the document.
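To make the contrast in point (2) concrete, here is a minimal sketch of asking the Erlang system (rather than the module itself) about a module, using the 'code' module. The module name module_props and the function describe/1 are hypothetical names chosen for illustration. Nothing comparable for packages appeared in the document under discussion.

```erlang
-module(module_props).
-export([describe/1]).

%% Hypothetical helper: collect a few properties of a module by asking
%% the code server about it.
describe(Mod) when is_atom(Mod) ->
    %% code:ensure_loaded/1 returns {module, Mod} or {error, Reason}.
    case code:ensure_loaded(Mod) of
        {module, Mod} ->
            {ok, [%% {file, Loaded} if loaded, otherwise false
                  {is_loaded, code:is_loaded(Mod)},
                  %% where the object code was found
                  {which, code:which(Mod)},
                  %% is it in the list of all loaded modules?
                  {in_all_loaded,
                   lists:keymember(Mod, 1, code:all_loaded())}]};
        {error, Reason} ->
            {error, Reason}
    end.
```

Calling module_props:describe(lists) returns a property list for the lists module; a name that cannot be loaded yields {error, Reason}.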

(3) Having packages that do not package is unhelpful.

    Java is notorious for this:  packages are accessibility scopes,
    but anybody can sneak new classes and subpackages into a package
    without the package having any say about it.

    This is really a fundamental problem with Java packages, and the
    scheme under consideration is far too Java-like for my taste.

    If there are to be packages, then a package should be able to list
    its children and insist that no other purported children are legitimate.

(4) A satisfactory solution to the "package" problem must deal with the
    issues of
    - including a module at more than one place in the space of packages
    - including more than one version of a module in a complete system
    If you consider these issues, you discover that the one thing that
    a module must NOT do is to know its exact place in a hierarchy of
    packages.

(5) For large scale systems, the rules for binding modules together
    have to be outside the modules.

    For a long time SASL felt wrong to me:  I thought most of the
    information in SASL files belonged in the modules.  But then I
    studied LACE, and understood what problems the LACE design was
    trying to solve, and realised that I was wrong about where the
    information belonged.  It's also worth looking at the
    configuration/distribution language that was developed for MESA.

(6) Global name spaces are a pain for really large systems.
    The global process registry (ANY global process registry) is
    an idea whose time has gone.
    The global module registry (ANY global module registry, whether
    module names are simple or dotted) is also an idea whose time
    has gone.

    Yes, this means that apply/3 and spawn/3 and so on need rethinking.
    Now that Erlang has closures, there are alternatives...
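As one illustration of the closure alternative mentioned in point (6), here is a minimal sketch, not a proposal. The names spawn_fun, double/1 and run/0 are hypothetical. spawn/3 needs a module name and an exported function, both looked up by name at run time; spawn/1 takes a fun, so the spawned code can be a local, unexported function and no module name is passed around as data.

```erlang
-module(spawn_fun).
-export([run/0]).

%% Deliberately NOT exported: spawn(?MODULE, double, [21]) could not
%% reach this function, but a closure created in this module can.
double(N) -> 2 * N.

run() ->
    Parent = self(),
    Ref = make_ref(),
    %% spawn/1 takes a closure; no module or function name is looked
    %% up in any global registry when the new process starts.
    _Pid = spawn(fun() -> Parent ! {Ref, double(21)} end),
    receive
        {Ref, Result} -> {ok, Result}
    after 1000 ->
        timeout
    end.
```

Here spawn_fun:run() waits for the reply from the spawned process and returns it tagged with ok.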

(7) JAM was not the last word in Erlang implementations.
    BEAM will not be the last word in Erlang implementations.
    "BEAM does so-and-so in such-and-such a manner" may decide what
    designs are easy to experiment with atop BEAM, but must not be
    allowed to dictate which designs are considered at all.

"Vlad Dumitrescu" <vlad_dumitrescu@REDACTED> wrote:
	[about using a per-module renaming table]
	For the first, that would mean that every remote call will have
	to check first the internal module table for a rename.  That
	would make these calls much slower.

This is an assertion without any evidence to back it up.
The experience of Smalltalk has, I believe, provided evidence
that the assertion is untrue.

	Again, if a module name is sent as a function argument, the
	local name wouldn't make sense to any other module.

This is definitely a problem with the existing scheme that we are
discussing.  The only way around it that does not introduce 
inconsistencies into the language is to ban module name abbreviation.

Consider
	foo:bar()
and
	(X = foo, X:bar())
and
	(X = foo, Y = bar, apply(X, Y, []))
where foo is the local abbreviation of ick.ack.uck.foo

If these DON'T do the same thing, the language is inconsistent.
If they DO do the same thing, then apply/3 _has_ to use a per-module
renaming table somehow.
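The consistency requirement can be sketched as follows. This is only an illustration under stated assumptions: renames/0, resolve/1 and local_apply/3 are hypothetical names, and the short alias l stands for the real module lists (in place of a dotted package name) so that the sketch runs on an Erlang/OTP without package support.

```erlang
-module(rename_sketch).
-export([resolve/1, local_apply/3, demo/0]).

%% Hypothetical per-module renaming table: local abbreviation -> full name.
renames() -> #{l => lists}.

%% Map a local abbreviation to the full module name; names that have
%% no entry in the table are returned unchanged.
resolve(Name) when is_atom(Name) ->
    maps:get(Name, renames(), Name).

%% What apply/3 would have to do to agree with a static call.
local_apply(M, F, Args) ->
    apply(resolve(M), F, Args).

demo() ->
    X = l,
    Y = reverse,
    A = (resolve(l)):reverse([1, 2, 3]),  % stands in for l:reverse(...)
    B = (resolve(X)):reverse([1, 2, 3]),  % stands in for X:reverse(...)
    C = local_apply(X, Y, [[1, 2, 3]]),   % stands in for apply(X, Y, [...])
    {A, B, C}.
```

All three forms in demo/0 go through the same table, so they give the same result; that is the sense in which apply/3 "has to" consult it.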

	Finally, what about functional objects?  The environment for a
	closure would need to include the rename table from the defining
	module.

I'm assuming standard terminology where "environment" refers to variable
bindings and "code" refers to the static data (including literals).  No,
the module context for a closure would _not_ be part of its environment,
it would be part of (the literals of) the code.

Again, either you do this or you introduce inconsistency into the language.
Consider

    foo:bar()

and

    (fun () -> foo:bar() end)()

If these DON'T do the same thing, the language is inconsistent.
If they DO do the same thing, then closures have to know how to do the
same renaming that the statically enclosing module would.
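A minimal sketch of the distinction between environment and code, again with hypothetical names (closure_ctx, renames/0, resolve/1, make_fun/0) and with the alias l standing for the real module lists. The fun below has no free variables, so its environment should be empty, yet it still resolves the alias the way its defining module does, because the renaming lives in the code of that module.

```erlang
-module(closure_ctx).
-export([make_fun/0, demo/0]).

%% Hypothetical renaming table belonging to this module's code.
renames() -> #{l => lists}.

resolve(Name) -> maps:get(Name, renames(), Name).

%% The closure captures no variables; the renaming is reached through
%% the code of the defining module, not through the environment.
make_fun() ->
    fun() -> (resolve(l)):reverse([1, 2, 3]) end.

demo() ->
    F = make_fun(),
    %% erlang:fun_info(F, env) reports the fun's free variables.
    {env, Env} = erlang:fun_info(F, env),
    {F(), Env}.
```

Wherever the fun returned by make_fun/0 is eventually called, it performs the same renaming as a direct call written in closure_ctx would.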

I am not proposing any new kind of renaming.  In these examples I am
only referring to the kind of local abbreviation in the *existing* scheme.

	These problems are solvable, but look like are imposing more
	restrictions for the programmer than they remove.

These problems *must* be solved if the *existing* scheme is not to result
in an inconsistent language.  The Simplest Thing That Could Possibly Work
is to ban _all_ module renaming and abbreviation.

	>In short, from the inside, there appears to be no significant difference
	>between a module name like foo_bar_baz and a module name like foo.bar.baz.

	Only that locally you can use baz:fun(...) instead of the full
	module name.

And we have now established that that kind of renaming EITHER yields
an inconsistent language OR requires the kind of per-module renaming
I mentioned before anyway.

	And also that maybe one wants/needs to rename the 'foo'
	application, and then not all module names have to be updated.

Actually, this is one of my main concerns, and it is one of the most
important reasons why I feel that ANY dotted name scheme is a bad idea.