Packages in Erlang: new documentation
Richard A. O'Keefe
ok@REDACTED
Fri Sep 5 03:05:39 CEST 2003
I pointed out that apply/3 and spawn/3 and suchlike could be made
to work in the presence of module renaming.
This is a good time to point out that the longer I ponder this,
and I have been thinking about Erlang modules for about 7 years now,
the clearer it seems to me that
(1) Module names must be completely decoupled from file names.
[This does not mean that you cannot have a _default_ mapping
from module names to file names, only that you cannot have a
_fixed_ mapping.]
(2) It is a bad idea to introduce names that don't actually name
anything.
Erlang modules are somewhat anomalous in the language. There really
are module "objects", with many properties, but there are not module
"values". If you don't believe that modules are objects in Erlang,
take a look at the 'c' and 'code' modules, and you'll see what I mean.
While you cannot ask a module *itself* for its properties, you can
ask the Erlang system for the properties of a module.
It is possible to have a system of hierarchical modules. That's
how Pop-2 does it, and that's how Ada does it, and that's how Mercury
does it. In those languages, modules may contain types, functions,
_and other modules_.
In contrast, this package proposal introduces a class of names
that do not name anything. Where are the operations to
- list all loaded packages
- list all available packages
- list the contents of a package
- load a package
- delete a package
- purge a package
- ensure that a package is replicated on another node
and so on?
These operations may exist, but I didn't see them in the document
we were asked to look at. If they do, I would be interested in
reading a fuller version of the document.
(3) Having packages that do not package is unhelpful.
Java is notorious for this: packages are accessibility scopes,
but anybody can sneak new classes and subpackages into a package
without the package having any say about it.
This is really a fundamental problem with Java packages, and the
scheme under consideration is far too Java-like for my taste.
If there are to be packages, then a package should be able to list
its children and insist that no other purported children are legitimate.
(4) A satisfactory solution to the "package" problem must deal with the
issues of
- including a module at more than one place in the space of packages
- including more than one version of a module in a complete system
If you consider these issues, you discover that the one thing that
a module must NOT do is to know its exact place in a hierarchy of
packages.
(5) For large scale systems, the rules for binding modules together
have to be outside the modules.
For a long time SASL felt wrong to me: I thought most of the
information in SASL files belonged in the modules. But then I
studied LACE, and understood what problems the LACE design was
trying to solve, and realised that I was wrong about where the
information belonged. It's also worth looking at the
configuration/distribution language that was developed for MESA.
(6) Global name spaces are a pain for really large systems.
The global process registry (ANY global process registry) is
an idea whose time has gone.
The global module registry (ANY global module registry, whether
module names are simple or dotted) is also an idea whose time
has gone.
Yes, this means that apply/3 and spawn/3 and so on need rethinking.
Now that Erlang has closures, there are alternatives...
(7) JAM was not the last word in Erlang implementations.
BEAM will not be the last word in Erlang implementations.
"BEAM does so-and-so in such-and-such a manner" may decide what
designs are easy to experiment with atop BEAM, but must not be
allowed to dictate which designs are considered at all.
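Two of the points above can be made concrete against a present-day Erlang/OTP system. The sketch below is an illustration only, not part of the proposal under discussion: the module name points_sketch and its functions are hypothetical, while the calls into the code module, module_info/1 and spawn/1 are standard. demo_modules/0 shows the kind of operations that exist for modules (point 2) and for which no package counterpart was visible in the document; demo_spawn/0 shows a closure standing in for the module and function names that spawn/3 requires (point 6).

```erlang
-module(points_sketch).
-export([demo_modules/0, demo_spawn/0, worker/2]).

%% Point (2): modules are objects that the system can be asked about.
demo_modules() ->
    {module, lists} = code:ensure_loaded(lists),          % load a module
    {file, _} = code:is_loaded(lists),                    % query one module
    true = lists:keymember(lists, 1, code:all_loaded()),  % list loaded modules
    true = lists:member({reverse, 1}, lists:module_info(exports)),
    ok.

%% Exported only so that spawn/3 can reach it by name.
worker(Parent, Tag) ->
    Parent ! {self(), Tag}.

%% Point (6): spawn/3 needs globally resolvable names; spawn/1 needs
%% only a closure.
demo_spawn() ->
    Parent = self(),
    P1 = spawn(?MODULE, worker, [Parent, by_name]),
    receive {P1, by_name} -> ok end,
    P2 = spawn(fun () -> worker(Parent, by_closure) end),
    receive {P2, by_closure} -> ok end,
    ok.
```

Both functions return ok when every match succeeds. Note that worker/2 has to be exported for the spawn/3 form but not for the closure form.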
"Vlad Dumitrescu" <vlad_dumitrescu@REDACTED> wrote:
[about using a per-module renaming table]
> For the first, that would mean that every remote call will have
> to check first the internal module table for a rename. That
> would make these calls much slower.
This is an assertion without any evidence to back it up.
The experience of Smalltalk has, I believe, provided evidence
that the assertion is untrue.
> Again, if a module name is sent as a function argument, the
> local name wouldn't make sense to any other module.
This is definitely a problem with the existing scheme that we are
discussing. The only way around it that does not introduce
inconsistencies into the language is to ban module name abbreviation.
Consider
    foo:bar()
and
    (X = foo, X:bar())
and
    (X = foo, Y = bar, apply(X, Y, []))
where foo is the local abbreviation of ick.ack.uck.foo.
If these DON'T do the same thing, the language is inconsistent.
If they DO do the same thing, then apply/3 _has_ to use a per-module
renaming table somehow.
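One way to picture a per-module renaming table is as a lookup consulted before the ordinary apply/3. This is a minimal sketch under stated assumptions, not the mechanism of the package proposal or of any Erlang implementation: the module rename_sketch, the functions resolve/2 and apply_renamed/4, and the abbreviation l for lists are all hypothetical, and maps postdate this discussion.

```erlang
-module(rename_sketch).
-export([resolve/2, apply_renamed/4, demo/0]).

%% Hypothetical per-module table: local abbreviation -> full module name.
%% A name with no entry resolves to itself.
resolve(Table, Mod) ->
    maps:get(Mod, Table, Mod).

%% Like apply/3, but the module name is first passed through the table
%% belonging to the calling module.
apply_renamed(Table, Mod, Fun, Args) ->
    apply(resolve(Table, Mod), Fun, Args).

demo() ->
    Table = #{l => lists},
    [3, 2, 1] = apply_renamed(Table, l, reverse, [[1, 2, 3]]),
    [3, 2, 1] = apply_renamed(Table, lists, reverse, [[1, 2, 3]]),
    ok.
```

Because unlisted names resolve to themselves, fully qualified names keep working unchanged, and the abbreviated and full forms give the same result.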
> Finally, what about functional objects? The environment for a
> closure would need to include the rename table from the defining
> module.
I'm assuming standard terminology where "environment" refers to variable
bindings and "code" refers to the static data (including literals). No,
the module context for a closure would _not_ be part of its environment;
it would be part of (the literals of) the code.
Again, either you do this or you introduce inconsistency into the language.
Consider
    foo:bar()
and
    (fun () -> foo:bar() end)()
If these DON'T do the same thing, the language is inconsistent.
If they DO do the same thing, then closures have to know how to do the
same renaming that the statically enclosing module would.
I am not proposing any new kind of renaming. In these examples I am
only referring to the kind of local abbreviation in the *existing* scheme.
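In today's Erlang, where module names are global, the direct call and the call inside a fun agree, and a closure created in compiled code already records which module defined it. The sketch below shows both facts; the module name closure_sketch is hypothetical, and the example says nothing about how a renaming scheme would actually be implemented.

```erlang
-module(closure_sketch).
-export([demo/0]).

demo() ->
    Direct = lists:seq(1, 3),
    F = fun () -> lists:seq(1, 3) end,
    %% The call inside the fun must agree with the direct call.
    Direct = F(),
    %% A fun created in compiled code records its defining module.
    {module, ?MODULE} = erlang:fun_info(F, module),
    ok.
```

The second match only holds when the module is compiled; a fun typed into the shell reports a different defining module.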
> These problems are solvable, but look like they are imposing more
> restrictions for the programmer than they remove.
These problems *must* be solved if the *existing* scheme is not to result
in an inconsistent language. The Simplest Thing That Could Possibly Work
is to ban _all_ module renaming and abbreviation.
> >In short, from the inside, there appears to be no significant difference
> >between a module name like foo_bar_baz and a module name like foo.bar.baz.
> Only that locally you can use baz:fun(...) instead of the full
> module name.
And we have now established that that kind of renaming EITHER yields
an inconsistent language OR requires the kind of per-module renaming
I mentioned before anyway.
> And also that maybe one wants/needs to rename the 'foo'
> application, and then not all module names have to be updated.
Actually, this is one of my main concerns, and it is one of the most
important reasons why I feel that ANY dotted name scheme is a bad idea.