[erlang-questions] Why we need a -module() attribute?

Mon Feb 22 01:45:30 CET 2016

On 22/02/16 12:08 am, Loïc Hoguin wrote:
> It seems that you consider beam and source files to be equivalent.

Why would you say that?

What *is* true is that the name of a .beam file is derived from
the name of a .erl file, that the module name must match the
pre-extension part of the .erl file, and that the module name
must also match the pre-extension part of the .beam file.

Example:
% echo "-module(foo)." >foo.erl
% erlc foo.erl
% mv foo.beam Foo.beam
% erl
1> l('Foo').
{error,badfile}

=ERROR REPORT==== 22-Feb-2016::12:56:36 ===
Loading of /home/cshome/o/ok/COSC400/Foo.beam failed: badfile

=ERROR REPORT==== 22-Feb-2016::12:56:36 ===
beam/beam_load.c(1278): Error loading module 'Foo':
   module name in object code is foo

> A beam file should contain a module name, there's no doubt about this.

I think you need to argue for that.

> So loading from a .ez file would work, so upgrading would work, and
> so on.

You're skipping over the key step.  As far as I know, loading a file from
a .ez file is NOT done by searching the contents of the pseudo-files
for one with a matching module name; it is based on the
(pseudo-)file name, just like a file in the host file system, but with
possibly different naming rules.

> But there's no point in having the source file require the module name 
> directly in the source if the compiler can figure it out (and it can 
> *because we give it* when we ask the compiler to compile files).

Eh?  When we ask the compiler to compile FILES we give FILE names.
MODULE names look like file names (as long as you close one eye and
squint with the other) but they are not the same thing.

To make just one little distinction, for every programming language I know,
identifiers are either ALWAYS case sensitive or ALWAYS case insensitive.
But in a file system, file names in some directories may be case sensitive
and file names in other directories may not be case sensitive.

*IF* module names were restricted to lower case letters,
the compiler could figure out that 'LISTS.ERL' contained
the source for 'lists'.  But they aren't, so it can't.
For that matter, on my laptop (a Mac), for all the compiler
can tell, 'lists.erl' might be the file for 'LISTS' or 'Lists'.
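
A minimal sketch of that ambiguity, assuming a compiler that had nothing
but the file name to go on.  The module name_guess and the function
guess/1 are hypothetical names made up for this illustration; only
filename:extension/1, filename:basename/2 and list_to_atom/1 are real
library calls.

```erlang
-module(name_guess).
-export([guess/1]).

%% Hypothetical helper: derive a module name from a source file name
%% by stripping the directory and the extension, keeping case as given.
guess(File) ->
    Ext = filename:extension(File),
    list_to_atom(filename:basename(File, Ext)).
```

Here guess("LISTS.ERL") gives 'LISTS' and guess("lists.erl") gives lists.
Those are two different atoms, yet on a case-insensitive file system both
names can refer to the same file, so the file name alone does not settle
which module was meant.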

> The compiler should error out if it can't find a module name *at all*, 
> not if it can't find it *in the source*.

Right now we have a combination of two things.

(1) A simple obvious human-friendly -module directive that informs
      both human beings and compilers of the name of a module,
      *regardless* of the kind of container the text is held in
      (it could, for example, be a CLOB in a database), and at a
      cost to the programmer of somewhere between negligible and zero.

(2) A crude hack for autoloading that hopes file names can be long
      enough to hold module names (I have a VM with a mainframe
      operating system that limits names to 17 characters) and that
      hopes nobody will have two modules whose names differ only
      in case and that hopes Unicode isn't a problem.
      This really also covers loading: when you say l(foobar) Erlang
      thinks you're giving it a file name but what you usually want to
      give it is a module name.
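
To make point (1) concrete, here is a rough sketch of compiling and loading
module text that never lives in a file.  The string stands in for a CLOB;
the module clob_demo and its functions are hypothetical names invented for
this example.  It assumes the text is well-formed Erlang that contains a
-module directive.  compile:forms/1 reports the module name it found in the
forms, and the "nofile" string handed to code:load_binary/3 is only a label
recorded by the code server, not something that is looked up.

```erlang
-module(clob_demo).
-export([load_from_text/1]).

%% Compile and load module source held in a string.
%% The module name comes from the -module directive in the text.
load_from_text(Text) ->
    {ok, Tokens, _End} = erl_scan:string(Text),
    Forms = split_forms(Tokens, [], []),
    {ok, Mod, Bin} = compile:forms(Forms),
    {module, Mod} = code:load_binary(Mod, "nofile", Bin),
    Mod.

%% Cut the token list at each dot and parse one form per piece.
split_forms([], [], Acc) ->
    lists:reverse(Acc);
split_forms([{dot, _} = Dot | Rest], Cur, Acc) ->
    {ok, Form} = erl_parse:parse_form(lists:reverse([Dot | Cur])),
    split_forms(Rest, [], [Form | Acc]);
split_forms([Tok | Rest], Cur, Acc) ->
    split_forms(Rest, [Tok | Cur], Acc).
```

For example, calling
clob_demo:load_from_text("-module(hello). -export([hi/0]). hi() -> world.")
returns hello, after which hello:hi() can be called, and no file name was
involved at any step.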

Now (2) was fair enough at the time; it was entirely typical of the
days when everything was ASCII and 100 MB was a lot of disc.
Those days are gone.

Until (2) is fixed somehow, (1) and its analogue of 'redundantly'
storing a module name in a .beam file are our principal protection
against (2) going wrong, which it does.

(1) is not a problem.  (2) is the problem.
(1) is the (current) solution.

I have this mental picture of someone out at sea in a small boat
saying "what's this nasty old thing sticking in the bottom?" and
wanting to pull out the drain plug.