[erlang-questions] Atom Unicode Support

Tue Feb 2 13:59:59 CET 2016

On Tuesday, 2 February 2016 07:26:25 you wrote:
> On 02/02, José Valim wrote:
> >> Are you ready to edit Chinese or Thai?
> >
> >I can already write atoms in Afrikaans, Albanian, Basque, Breton, Corsican,
> >Danish, English, Faroese, Galician, German, Icelandic, Indonesian, Italian,
> >Kurdish, Leonese, Luxembourgish, Malay, Manx, Norwegian, Occitan,
> >Portuguese, Rhaeto-Romanic, Scottish Gaelic, Spanish, Swahili, Swedish,
> >Walloon.
> >
> >Source: https://en.wikipedia.org/wiki/ISO/IEC_8859-1
> >
> >And I can only effectively speak 3 of those. There are already a lot of
> >programmers who can write in their own language and I don't expect to edit
> >their code.
> >
> >Not only that, there are other programming languages which have already
> >adopted Unicode Support and the amount of times I had to read code in a
> >language I don't understand has been exactly 0.
> >
> 
> I want to also voice my support for Unicode support.
> 
> I don't know why every time Unicode is brought up in this mailing list a 
> bunch of people suddenly fear having to edit code in a language they 
> don't understand.  This has been possible already for a long time (as 
> pointed out by the list of languages José added there). I think last 
> time someone was being preemptively angry because they could be buying a 
> business where code was in a different language and then they would be 
> screwed! The horror.

I don't know that anyone is objecting, exactly. For my part I'm just
hoping that it's not (yet another) naive implementation that traps us with
a "darn, that was soooo close to actually being really useful" sort of
situation.

For example, if there were a way to leave the door open to atoms being
recognized unquoted based on script set (the same way capitalization works
with Latin characters), that could indeed be super useful (I'm being selfish
here, of course, and thinking of the kana situation in Japanese).
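
To make the idea concrete, here is a minimal sketch (in Python, purely for
illustration) of a tokenizer rule that decides atom vs. variable from the
script of the first character, the way capitalization does for Latin text.
The mapping itself (hiragana behaves like lowercase, katakana like uppercase,
anything else needs quotes) and the name `classify` are assumptions made up
for this example; nothing here is something Erlang implements or has proposed.

```python
import unicodedata

def classify(token: str) -> str:
    """Classify a bare token from its first character.

    Hypothetical rule, assumed only for this sketch:
      lowercase Latin or hiragana        -> "atom"
      uppercase Latin, "_" or katakana   -> "variable"
      anything else (kanji, digits, ...) -> "needs_quotes"
    """
    if not token:
        return "needs_quotes"
    first = token[0]
    # Cased scripts: keep the existing Erlang-style convention.
    if first == "_" or first.isupper():
        return "variable"
    if first.islower():
        return "atom"
    # Uncased scripts: fall back to the Unicode character name,
    # e.g. "HIRAGANA LETTER A" or "KATAKANA LETTER KA".
    name = unicodedata.name(first, "")
    if name.startswith("HIRAGANA"):
        return "atom"
    if name.startswith("KATAKANA"):
        return "variable"
    return "needs_quotes"

for tok in ["ok", "Value", "なまえ", "ナマエ", "漢字"]:
    print(tok, classify(tok))
```

Kanji is left in the "needs_quotes" bucket here only because it has no
obvious counterpart to upper and lower case; a real design would have to
decide that, along with half-width kana, which this sketch does not handle.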

A comment earlier was that "Japanese prolog programmers have been writing
their entire programs in Kanji for years." <- something I have seen exactly
zero cases of outside of teaching examples. I've never even seen this at NEC,
and they are pretty big on taking advantage of native language facilities
or building them themselves when possible. Sure, it's *possible*, but it's
also a PITA most of the time. It's a lot easier to just stay in romaji input
mode than to flip around between kana and kanji symbols and the half-width
ASCII symbols necessary for brackets, colons, arrows and whatnot (granted,
some languages actually do accept full-width symbols for arrows, brackets,
etc., but that's a LOT rarer than just accepting kana/kanji characters and
thinking that's sufficient).

The only place in CJK land I've ever seen much use of Unicode source is
in various Lisp-based scripting and control languages (because they usually
accept both types of parens and full-width numbers).
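
As a rough illustration of what "accepting both types" can mean, the sketch
below (Python, for illustration only) folds full-width punctuation and digits
to their ASCII forms with Unicode NFKC normalization before tokenizing. The
helper name `fold_width` is hypothetical, and this is just one possible
approach under that assumption, not a description of how any particular
language does it.

```python
import unicodedata

def fold_width(source: str) -> str:
    """Fold compatibility forms to their canonical equivalents (NFKC).

    Full-width brackets, digits, colons and "arrows" typed as full-width
    hyphen + greater-than become plain ASCII; half-width katakana becomes
    ordinary katakana.
    """
    return unicodedata.normalize("NFKC", source)

print(fold_width("（１２３）"))  # full-width parens and digits
print(fold_width("－＞"))        # full-width hyphen-minus and greater-than
print(fold_width("ｶﾅ"))          # half-width katakana
```

Applying this to a whole source file would also rewrite the contents of
string literals and comments, so a real lexer would more likely fold only
the characters it is about to treat as syntax.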

I'm all for Unicode. It is indeed looooong overdue. I hope that whatever way
it happens to get implemented makes it actually useful instead of just being
another "me, too!" sort of feature... which is how I will feel if it turns
out like most languages: possible -> yes; practical -> no. Some of Richard's
examples were especially tricky-looking, and they weren't even
half-width/full-width issues.

Speaking of being overdue... when we say "atoms" do we also mean modules
and functions or... ?

The concern that suddenly code will become unreadable because the Tower of
ASCII Babel has fallen is ridiculous. I'm more interested specifically in
the opposite problem, where Unicode support exists, but it isn't useful
enough to make it a practical option in real code intended to do more than
just introduce kids to programming.

-Craig