Packages in Erlang: new documentation

Fri Sep 5 07:37:33 CEST 2003

On Fri, 5 Sep 2003 13:05:39 +1200 (NZST)
"Richard A. O'Keefe" <ok@REDACTED> wrote:

> I pointed out that apply/3 and spawn/3 and suchlike could be made
> to work in the presence of module renaming.
> 
> This is a good time to point out that the longer I ponder this,
> and I have been thinking about Erlang modules for about 7 years now,
> the clearer it seems to me that
> 
> (1) Module names must be completely decoupled from file names.
>     [This does not mean that you cannot have a _default_ mapping
>     from module names to file names, only that you cannot have a
>     _fixed_ mapping.]

OK, so consider the current mapping the default one, and consider the
mechanism for changing mappings not yet implemented.  As pointed out, it
would not be hard to implement seamlessly.  It just seems that no one
has yet been put in a position to need it.

> (2) It is a bad idea to introduce names that don't actually name
>     anything.
>
>     Erlang modules are somewhat anomalous in the language.  There
>     really are module "objects", with many properties, but there are
>     not module "values".  If you don't believe that modules are objects
>     in Erlang, take a look at the 'c' and 'code' modules, and you'll
>     see what I mean. While you cannot ask a module *itself* for its
>     properties, you can ask the Erlang system for the properties of a
>     module.

If modules are objects in Erlang, then so are files and TCP/IP sockets.
And packages.
None of these are "first-class types", though; for example, you can't
say 'f(X) when is_module(X).'  All we really have are references to
modules via their names (atoms.)  Ditto packages.
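
To make that concrete, here is a minimal sketch of asking the system
about a module when all you hold is its name.  The module name
module_props and the function describe/1 are made up for illustration;
the calls into 'code' and 'erlang' are the standard ones.

```erlang
-module(module_props).
-export([describe/1]).

%% Mod is only an atom.  Everything below is a question put to the
%% runtime system about whatever module that atom happens to name.
describe(Mod) when is_atom(Mod) ->
    Loaded = erlang:module_loaded(Mod),   % true | false
    File = code:which(Mod),               % path, or e.g. non_existing
    [{loaded_before_call, Loaded},
     {file, File},
     {exports, exports(Mod)}].

%% Loads the module if needed, then asks for its export list.
exports(Mod) ->
    case code:ensure_loaded(Mod) of
        {module, Mod}    -> Mod:module_info(exports);
        {error, _Reason} -> []
    end.
```

Note that there is no guard that distinguishes "an atom that names a
module" from any other atom; describe/1 has to go and ask.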

>     It is possible to have a system of hierarchical modules.  That's
>     how Pop-2 does it, and that's how Ada does it, and that's how
>     Mercury does it.  In those languages, modules may contain types,
>     functions, _and other modules_.
> 
>     In contrast, this package proposal introduces a class of names
>     that do not name anything.  Where are the operations to
>         - list all loaded packages
>         - list all available packages
>         - list the contents of a package
>         - load a package
>         - delete a package
>         - purge a package
>         - ensure that a package is replicated on another node
>     and so on?

Again, just consider them not yet implemented.  They'd be trivial to
implement (e.g. listing the contents of a package would be implemented
in terms of listing directory contents.)
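
As a rough sketch of what I mean, assuming a package name like
"ick.ack.uck" maps to the directory ick/ack/uck under some root
directory (for instance one entry of the code path).  The module
package_ls and both function names are hypothetical, not part of any
existing library.

```erlang
-module(package_ls).
-export([package_dir/1, contents/2]).

%% "ick.ack.uck" -> "ick/ack/uck".
%% Assumes a non-empty dotted package name given as a string.
package_dir(Package) ->
    filename:join(string:tokens(Package, ".")).

%% Names (as strings) of the compiled modules found directly in the
%% package's directory under Root.  Returns [] if the directory does
%% not exist.
contents(Root, Package) ->
    Dir = filename:join(Root, package_dir(Package)),
    [filename:basename(F, ".beam") || F <- filelib:wildcard("*.beam", Dir)].
```

A real version would presumably walk every entry of code:get_path()
and merge the results, but that is only more of the same.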

>     These operations may exist, but I didn't see them in the document
>     we were asked to look at.  If they do, I would be interested in
>     reading a fuller version of the document.

I think you are building a straw man here.  It's clear to me that
package names *do* name something (they name namespaces, which are
implemented with directories) and that there *are* operations which
apply to them (even if they haven't been implemented yet.)

> (3) Having packages that do not package is unhelpful.
> 
>     Java is notorious for this:  packages are accessibility scopes,
>     but anybody can sneak new classes and subpackages into a package
>     without the package having any say about it.
> 
>     This is really a fundamental problem with Java packages, and the
>     scheme under consideration is far too Java-like for my taste.
> 
>     If there are to be packages, then a package should be able to list
>     its children and insist that no other purported children are
>     legitimate.

Why?

I mean, you obviously feel strongly about this, but you haven't
explained why it's a problem.  You're doing something similar to what
you accused Vlad of, a message or two back: making an assertion that X
is a bad thing without providing any evidence or reasoning for why it is
bad.

I guess I have a different notion of an 'accessibility scope', too -
packages aren't about restricting what can and cannot be a 'child' of
whatever else, so much as they're about:

a) organizing modules to make them easier to locate, and
b) reducing name clashes.

> (4) A satisfactory solution to the "package" problem must deal with
>     the issues of
>     - including a module at more than one place in the space of
>       packages
>     - including more than one version of a module in a complete
>       system
>     If you consider these issues, you discover that the one thing
>     that a module must NOT do is to know its exact place in a
>     hierarchy of packages.

I think this problem only shows up when the code path is eliminated.
For the sake of backwards-compatibility, it might never be.

> (5) For large scale systems, the rules for binding modules together
>     have to be outside the modules.
> 
>     For a long time SASL felt wrong to me:  I thought most of the
>     information in SASL files belonged in the modules.  But then I
>     studied LACE, and understood what problems the LACE design was
>     trying to solve, and realised that I was wrong about where the
>     information belonged.  It's also worth looking at the
>     configuration/distribution language that was developed for MESA.

Doesn't this completely contradict the idea of hierarchical modules,
where a module 'contains' other modules, then?  If the hierarchical
relationships between modules should be *outside* the modules, that
would seem to be an argument for packages as something *distinct* from
modules.

> (6) Global name spaces are a pain for really large systems.
>     The global process registry (ANY global process registry) is
>     an idea whose time has gone.
>     The global module registry (ANY global module registry, whether
>     module names are simple or dotted) is also an idea whose time
>     has gone.

Namespaces aren't registries.

To me, a namespace is conceptual.  And speaking conceptually, the top
level of a hierarchy is ALWAYS global.  How can it be anything but? 
Also, the set of fully-qualified names in a hierarchy is always
isomorphic to a single namespace.  That is, if I have a hierarchy like

a
+-a
  +-a
  +-b
+-b
  +-a
  +-b

...it maps perfectly to a flat global list like

a
a.a
a.a.a
a.a.b
a.b
a.b.a
a.b.b
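
The mapping is mechanical.  A minimal sketch, assuming a tree is
written as {Name, Children} with names as strings (the module name
flatten_names and this representation are just for illustration):

```erlang
-module(flatten_names).
-export([flatten/1]).

%% A node is {Name, Children}; Children is a list of nodes.
%% Returns every fully-qualified dotted name, parents before children.
flatten({Name, Children}) ->
    flatten(Name, Children).

flatten(Prefix, Children) ->
    [Prefix | lists:append([flatten(Prefix ++ "." ++ Name, Cs)
                            || {Name, Cs} <- Children])].
```

Given the hierarchy above,

  flatten_names:flatten({"a", [{"a", [{"a", []}, {"b", []}]},
                               {"b", [{"a", []}, {"b", []}]}]}).

returns ["a","a.a","a.a.a","a.a.b","a.b","a.b.a","a.b.b"].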

On the other hand, registries are implementations of the namespace
concept.  If you're saying that each process(/module/whichever) should
only ever register its name with its immediate parent - well that would
be good design a la the Law of Demeter, but it could also have
performance consequences as name lookups become recursive.  I think the
ultimate decision for where the registries should be partitioned should
be up to the engineer for any given project.  The fact remains that
conceptually, names can always be regarded as global in some sense.
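
To illustrate the trade-off, here is a toy sketch of the two extremes.
Plain lists stand in for whatever a real registry would use, and the
module and function names are invented for the example.

```erlang
-module(registry_sketch).
-export([lookup_nested/2, lookup_flat/2]).

%% Nested: every level only knows its immediate children.
%% A level is a list of {Name, Value, Children}; a lookup takes one
%% step per path component.  Assumes a non-empty path.
lookup_nested([Name], Level) ->
    case lists:keyfind(Name, 1, Level) of
        {Name, Value, _Children} -> {ok, Value};
        false                    -> error
    end;
lookup_nested([Name | Rest], Level) ->
    case lists:keyfind(Name, 1, Level) of
        {Name, _Value, Children} -> lookup_nested(Rest, Children);
        false                    -> error
    end.

%% Flat: one table keyed by the full path, {Path, Value}.
%% A lookup is a single search, however deep the name is.
lookup_flat(Path, Table) ->
    case lists:keyfind(Path, 1, Table) of
        {Path, Value} -> {ok, Value};
        false         -> error
    end.
```

Both answer the same question for the same fully-qualified name; they
differ only in where the table is cut up.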

>     Yes, this means that apply/3 and spawn/3 and so on need
>     rethinking. Now that Erlang has closures, there are
>     alternatives...

Well, closures don't cut it if you want to support code change.
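
What I have in mind is the usual server loop.  A fully-qualified call
is resolved by module name each time it is made, so it reaches the
newest loaded version of the module; a fun written as fun ... end
stays tied to the version of the module that created it.  A minimal
sketch (the module name and message formats are made up):

```erlang
-module(code_change_sketch).
-export([start/0, loop/1]).

%% Spawned by module/function name, not by closure.
start() ->
    spawn(?MODULE, loop, [0]).

loop(Count) ->
    receive
        bump ->
            %% Fully-qualified call: if a new version of this module
            %% has been loaded, the next iteration runs the new code.
            ?MODULE:loop(Count + 1);
        {get, From} ->
            From ! {count, Count},
            ?MODULE:loop(Count);
        stop ->
            ok
    end.
```

Had start/0 been spawn(fun() -> loop(0) end) with local calls to
loop/1 throughout, the process would keep running the old version
after a reload.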

> (7) JAM was not the last word in Erlang implementations.
>     BEAM will not be the last word in Erlang implementations.
>     "BEAM does so-and-so in such-and-such a manner" may decide what
>     designs are easy to experiment with atop BEAM, but must not be
>     allowed to dictate which designs are considered at all.

Absolutely.  Luckily, I know nothing about the innards of BEAM so you
don't have to worry about my responses being influenced by it.

> [...]
> These problems *must* be solved if the *existing* scheme is not to
> result in an inconsistent language.  The Simplest Thing That Could
> Possibly Work is to ban _all_ module renaming and abbreviation.

Why the heck not just use a macro???

  -define(foo, ick.ack.uck.foo).
  ...
  X = ?foo, X:bar().

-Chris