[erlang-questions] Why we need a -module() attribute?

Fri Feb 19 10:20:11 CET 2016

> Greetings,
>
> Given the problem with case-sensitive file systems, and identifiers in
> code, why not ignore case?

Sadly, it's not that simple.  Consider

a. i
b. İ
c. ı
d. I

Which of these count as equivalent-except-for-case depends on
the natural language they are written in.  In English, (a) and (d)
are a case pair; in Turkish, (a) pairs with (b) and (c) with (d).
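
To make that concrete, here is a rough sketch in Python (chosen only
for illustration).  Python's str.upper() applies Unicode's default,
language-independent mapping; turkish_upper() is a hypothetical
hand-written stand-in for a Turkish-aware mapping, not a library call.
Grouping the four letters by their upper-case form gives different
equivalence classes under the two rules.

```python
# The four letters from the list above: i, dotted capital I,
# dotless small i, and plain capital I.
LETTERS = ["i", "\u0130", "\u0131", "I"]

def default_upper(s: str) -> str:
    # Unicode default mapping, independent of any natural language.
    return s.upper()

def turkish_upper(s: str) -> str:
    # Hypothetical helper (an assumption for this sketch): in Turkish,
    # dotted i pairs with U+0130 and dotless U+0131 pairs with I.
    return s.replace("i", "\u0130").replace("\u0131", "I").upper()

def case_classes(upper):
    """Group LETTERS into classes that share the same upper-case form."""
    classes = {}
    for ch in LETTERS:
        classes.setdefault(upper(ch), []).append(ch)
    return sorted(classes.values())

default_classes = case_classes(default_upper)
turkish_classes = case_classes(turkish_upper)
```

Under the default mapping, three of the letters fall into one class and
the dotted capital stands alone; under the Turkish-style mapping they
split into two pairs.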

It's not as if case-insensitivity were all that natural for
natural languages either.  "THE" is now "TUE" but "the" has not changed.
A UN-man is not an un-man.  LAX is not particularly lax.
And to get away from initials,
  Q: What's that band?
  A: The Who.
  Q: The who?
  A: That's right.
And I'll never forget Mrs Which, Mrs What, and Mrs Who from
"A Wrinkle in Time".

> It seems to work for Eiffel. No results from searching for Eiffel and
> problems with case. It is not a large language, so the number of people
> that have been exposed to case insensitivity is small.

I suspect that most Eiffel programs are still written in Western
European languages for which Latin-1 works well, except perhaps
for strings.

> This would not work for Erlang, but new languages could try.

Unicode makes some complicated things possible and some "simple"
things extremely difficult.  Dealing with alphabetic case is one
of the latter.

One of the issues with file systems and alphabetic case is that it's
just not clear what actually happens.  For example, in a Unix-like
system where a file name is a sequence of bytes excluding 0 (NUL)
and 47 ("/"),
whether two such sequences should count as equivalent depends on
what the encoding is deemed to be.  I have certainly run into trouble
with file names written in one encoding being displayed unsuitably in
another, and the encoding is not (except in classic Mac OS) something
that is stored with the file name.  (I believe Mac OS X uses UTF-8.)
But if we say "let file names be interpreted as UTF-8", we are *still*
left with the problem that case equivalence is determined by
natural language, not by encoding alone.
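
A small Python sketch of the encoding half of the problem.  The helper
same_ignoring_case() and the two example file names are made up for
illustration; str.casefold() is Python's language-neutral Unicode
folding, so this still says nothing about Turkish or any other
language-specific rule.

```python
def same_ignoring_case(a: bytes, b: bytes, encoding: str) -> bool:
    """Compare two byte-string file names after decoding with `encoding`.

    Assumption: both names decode cleanly in the given encoding.
    """
    return a.decode(encoding).casefold() == b.decode(encoding).casefold()

# The same two byte sequences, as a Unix-like system would store them.
upper_name = b"R\xc9SUM\xc9.txt"
lower_name = b"r\xe9sum\xe9.txt"

# Read as Latin-1, bytes 0xC9 and 0xE9 are a capital/small letter pair.
latin1_says = same_ignoring_case(upper_name, lower_name, "latin-1")

# Read as code page 437, 0xC9 is a box-drawing character and 0xE9 is a
# Greek capital letter, so the two names no longer match.
cp437_says = same_ignoring_case(upper_name, lower_name, "cp437")
```

The bytes on disk are identical in both cases; only the assumed
encoding changed, and with it the answer.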

One problem with case-insensitive languages is that the same
identifier may appear as READSYMBOL, readsymbol, ReadSymbol,
or even ReadSYmbol.  Some languages have dealt with this by saying
that you can use any one capitalisation pattern you like: a name
that appears in a scope must be capitalised identically at every
occurrence.  (That amounts to saying that you must write so that
your program would work equally well in a case sensitive or a
case insensitive language.)
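
The "one capitalisation per name" rule is easy to check mechanically.
A minimal sketch in Python, assuming ASCII-only identifiers that have
already been extracted from one scope; inconsistent_names() and the
token list are hypothetical, not taken from any particular compiler.

```python
from collections import defaultdict

def inconsistent_names(identifiers):
    """Return {folded_name: sorted spellings} for names spelled >1 way."""
    spellings = defaultdict(set)
    for name in identifiers:
        # Assumption: ASCII identifiers, so lower() is an adequate fold.
        spellings[name.lower()].add(name)
    return {k: sorted(v) for k, v in spellings.items() if len(v) > 1}

tokens = ["READSYMBOL", "readsymbol", "ReadSymbol", "ReadSYmbol",
          "Count", "Count"]
report = inconsistent_names(tokens)
```

Here "Count" passes because it is spelled the same way both times,
while the four spellings of readsymbol would be reported.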

Then too, there are file names that are not legal identifiers.
Taking Java as an example, what should we do with a :@%%.java
file?  Or even 2016.java?
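
A sketch of the mismatch, in Python.  The regular expression is only an
ASCII approximation of a Java identifier (an assumption of this
example: real Java also accepts many Unicode letters and rejects
reserved words), and class_name_from_file() is a hypothetical helper.

```python
import os
import re

# Assumption: ASCII-only approximation of Java's identifier syntax.
ASCII_IDENTIFIER = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")

def class_name_from_file(file_name: str):
    """Return the file's stem if it could be a class name, else None."""
    stem, _ext = os.path.splitext(file_name)
    return stem if ASCII_IDENTIFIER.fullmatch(stem) else None

results = {name: class_name_from_file(name)
           for name in [":@%%.java", "2016.java", "ReadSymbol.java"]}
```

Both awkward names are perfectly good file names, yet neither yields
anything usable as a class name.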

Like I said before, tying file names and module (or class) names
together is like holding a chain-saw by the blade.