[erlang-questions] A proposal for Unicode variable and atom names in Erlang.

Tue Nov 6 04:24:45 CET 2012

On 6/11/2012, at 1:35 PM, Steve Davis wrote:
> Suppose the author is writing in a natural language where the *exact same Unicode characters* have entirely different semantics?

There's a science fiction story (sorry, I forget the title and author)
where one gimmick is the ambiguity of "Pet Shop".

> I have enough of a hard time with computer languages without having to know over 200 natural languages to boot.
> 
> Is the right decision, perhaps, to say that we need to agree on just one natural language for source

No.  It is that each *exchange* needs to involve an agreed language.

When I was at Quintus, we had a company in Israel develop some graphics
software for us.  (Good software too, but for unrelated reasons we never
shipped it, as far as I know.)  You say in the contract that the documentation
will be in English (although several of our people could read Hebrew) and
you say that the code and comments will be in English too.

What Unicode makes possible is a contract where a company in Israel asks
a company in the US to provide documentation and code in Hebrew, and
there is no technical barrier to them doing it.  It also lets the
Israelis write scaffolding code in Hebrew if they want to.

We do not need "One Ring to rule them all and in the darkness bind them".
English for everything would suit me fine, if it _was_ English, and not
American (:-).

> - since that means you need to learn at most two languages? (And, also, did that natural language decision not happen already in every major computer system?)

Every major computer system has been busy unmaking that decision for
decades.

>
> If you think it's a good idea to change that status quo, then please let me know which natural language to use (yes, even if the choice were not a natural language that I currently know), just so I have a limit on where I need to educate myself. I have enough issues with encodings without being asked to learn every natural language in existence.

Nobody is asking you to do that.
For one thing, there are about six or seven thousand natural languages
in existence.  Unicode covers dozens of _scripts_ that I've never heard
of.  Heck, it includes scripts that nobody in the whole world can _read_.
(Unless you believe that the author of 'Code Breaker' got it right, and
I thought he was pretty convincing.)  Yes, I do mean U+101D0 to U+101FD,
the PHAISTOS DISC SIGN ... characters.
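
As a quick check on that range, here is a minimal Python sketch. It is Python rather than Erlang only because the standard `unicodedata` module exposes character names. The variable names are illustrative, and it assumes a Python 3 build whose Unicode database includes the Phaistos Disc block.

```python
import unicodedata

# The range quoted above: U+101D0 through U+101FD, inclusive.
FIRST, LAST = 0x101D0, 0x101FD

# unicodedata.name() raises ValueError for an unassigned code point,
# so building this list also confirms every code point in the range is assigned.
names = [unicodedata.name(chr(cp)) for cp in range(FIRST, LAST + 1)]

# Every name in the range should carry the same prefix.
all_phaistos = all(n.startswith("PHAISTOS DISC SIGN") for n in names)

print(len(names), all_phaistos)
print(names[0])
```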

We are *not* talking about something new here.
As I keep pointing out, *nothing* stops people writing Erlang
in Klingon.  They don't even have to leave ASCII for that.
It's just that _if_ they do, they have to take the consequences of
nearly everyone else being unable to read it.
Nobody has forced you to learn Klingon just because it's possible
to write Erlang in Klingon, have they?

Or let's take a real example.  Erlang currently uses Latin-1.
Latin-1 lets you write Icelandic.  Has anybody been dumping Icelandic
Erlang on your desk, _expecting you to read it_?
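
A small Python sketch of the Latin-1 point. The helper `fits_latin1` and the sample strings are made up for illustration, and it checks only the character repertoire, not what the Erlang scanner accepts in identifiers. The Icelandic letters outside ASCII encode in Latin-1, while Hebrew text does not.

```python
def fits_latin1(text: str) -> bool:
    """Return True if every character of text is representable in Latin-1."""
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


# Icelandic letters that fall outside ASCII (illustrative sample).
icelandic_letters = "þ ð æ ö á é í ó ú ý Þ Ð Æ Ö Á É Í Ó Ú Ý"

# A Hebrew word, for contrast (illustrative sample).
hebrew_word = "שלום"

print(fits_latin1(icelandic_letters))
print(fits_latin1(hebrew_word))
```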

Unicode introduces the problem that Erlang code might be written in
a *script* that you cannot read.  But the problem that it might be
in a *language* you cannot make head or tail of has been with us for
a long time, and the sky has not fallen.