[erlang-questions] A proposal for Unicode variable and atom names in Erlang.

Richard O'Keefe ok@REDACTED
Tue Oct 30 05:11:06 CET 2012


On 22/10/2012, at 7:44 PM, Rustom Mody wrote:
> 1. Python made a choice to embrace Unicode more thoroughly in going from Python 2 to Python 3.  This seems to have caused some grief in that 'ASCII' code that used to work in Python 2 now often does not in Python 3. Maybe this has nothing to do with Richard's EEP, because that is about the string data structure and this is about variable names. Still, just mentioning.

Can you be more specific?  Each ASCII character has the same numeric value
in Unicode, and an ASCII string represented as UTF-8 is exactly the same
sequence of bytes.  I can't help wondering if "ASCII" here really means
some 8-bit character set rather than ASCII.
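
To make that concrete, here is a minimal sketch in Python 3 (my choice of
language for illustration only; the variable names are made up).  It checks
that all 128 ASCII characters keep their numeric values and encode to the same
bytes under UTF-8, and that a Latin-1 byte above 0x7F, taken here as an
assumed example of "some 8-bit character set", is not valid UTF-8 by itself.

```python
# Every ASCII code point (0..127) keeps its numeric value in Unicode,
# and its UTF-8 encoding is the single byte with that same value.
ascii_text = "".join(chr(n) for n in range(128))

as_ascii = ascii_text.encode("ascii")
as_utf8 = ascii_text.encode("utf-8")

# True when the two encodings produce identical byte sequences.
same_bytes = as_ascii == as_utf8

# True when each character's code point equals its encoded byte.
same_values = all(ord(ch) == byte for ch, byte in zip(ascii_text, as_utf8))

# By contrast, a Latin-1 byte above 0x7F is not valid UTF-8 on its own.
latin1_bytes = "\u00e9".encode("latin-1")  # e-acute as the single byte 0xE9
try:
    latin1_bytes.decode("utf-8")
    latin1_is_valid_utf8 = True
except UnicodeDecodeError:
    latin1_is_valid_utf8 = False
```

If the code that broke in the move to Python 3 contained bytes like that last
one, then it was never ASCII in the first place.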

> In all fairness (for Yurii's points) I should mention:
> 1. I was typing this on a windows box and could not see the characters until I switched to linux
> 2. Our computers may become completely, effortlessly unicode-capable someday, our keyboards will never. So to the extent that code is meant to be written, ASCII will always trump.  To the extent that it is to be read, a richer (within limits) character set has its attractions.

You are assuming that everyone who is using a keyboard is using a US keyboard.
That's not true.  For example, on a visit to Sweden, I was allowed to use my
host's computer to read my mail remotely, and my fingers kept tripping up
because it was a Swedish keyboard with lots of non-ASCII characters.
Heck, my wife has an iPad, and I have one on loan from the department, and
both of them have Greek keyboards installed, making it pretty much effortless
to type Greek, which I assure you is NOT ASCII.  It's just a matter of touching
the globe symbol and flicking over to the other keyboard.  This is old technology.
The Xerox D-machines had fast-switch virtual keyboards back in the 1980s.
It takes two mouse movements to switch from a US keyboard to a Greek one on my
desktop Mac (or to a Hebrew one or a Russian one or ...).

RIGHT NOW, our keyboards ARE completely, effortlessly non-Latin-1 capable.

Nobody is suggesting that any one programmer will want to use all 100,000+
Unicode characters in the same document.  What is suggested is that some
programmers, who can effortlessly type Russian on their Russian keyboard or
Gujarati on the Gujarati keyboard -- both of which Windows supports -- and
see that on their screen, should be able to do so.

I cannot for the life of me understand why, at this late date, anyone should
for an instant suppose that only ASCII can be easily typed.

As it happens, for my national needs, the Mac _does_ have Māori keyboard
support.  It's two mouse movements to switch from US keyboard to Māori one,
and then getting a vowel with a macron is just a matter of pressing the
Option key while typing the vowel.  A Māori student would have little reason
ever to switch over to the US keyboard.  I can certainly type words like
kurī and kīrehe and Ākarana without taking my fingers from the keyboard.

The idea that "ASCII will always trump" on account of being easier to type
deserves some kind of award for wrongness.