Language change proposal

Sat Nov 1 10:40:52 CET 2003

Håkan Stenholm wrote:
> There is also the problem of mixing native language identifiers with the 
> English ones from the OTP libs, which is bound to look rather odd and 
> might possibly be confusing in some cases, where the words mean 
> different things in each language. It also limits the portability of 
> the code as fewer people can understand it - imagine Linux written in 
> Finnish.

That's less of a problem: anybody who writes code that should be used by 
an international audience knows that he should write in English.
That's also the reason why Linux is written in English - had Linus stuck 
to Finnish identifiers, Linux wouldn't be an international platform.

>> 1) If somebody gives me software to maintain, I might hit a, say, 
>> Chinese glyph somewhere. I'd have to download the proper font just to 
>> be able to look at the sources.
> 
> It might also be just a bit tricky to figure out how to write the 
> glyph/s, if it's something like Japanese, Chinese or Korean.

The software that displays Unicode is supposed to do that for you.
Actually there are issues that I haven't seen properly handled yet; for 
example, one Far-East script (Indonesian IIRC) has glyphs that /go 
around/ their neighbouring glyph.
Human writing is indeed a strange, aesthetically wonderful but 
technically over-complicated beast - and Unicode is designed for 
aesthetics and completeness, not for making life easy on the programs 
that use it.

>> Unicode also has issues with letter case.
> 
> Isn't this really a kind of design error/bug/feature in Erlang?
> While I personally would prefer code to be written in English I don't 
> see any real problems with using Unicode.

Neither do I - but why use Unicode if you're writing in English anyway? 
Even 7-bit ASCII is enough. Heck, even the common subset of EBCDIC and 
ASCII would be enough!

> The simplest way would
> probably be to introduce some kind of standard upper case marker 
> (character) in the case that there is no upper case version of a 
> character. Another somewhat more confusing choice would be to require 
> that functions can only start with upper case Unicode letters (possibly 
> only the characters supplied in the current Erlang character set).

Too complicated, too much of a burden on the programmer to remember 
correctly, too much of a burden on the maintainer to interpret correctly.

At least that was my initial reaction. Seeing a concrete example of how 
this is done elegantly in practice, I might reconsider :-)
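
To make the letter-case issue concrete, here is a minimal sketch in Python
(not Erlang, and not an implementation of the marker proposal quoted above).
It assumes the usual Erlang convention that a leading upper-case letter or
underscore marks a variable and a leading lower-case letter marks an atom.
The function name classify_identifier and the "caseless" label are
hypothetical, chosen only for illustration.

```python
import unicodedata

def classify_identifier(name):
    """Classify an identifier by the Unicode category of its first character."""
    first = name[0]
    category = unicodedata.category(first)
    # Lu = upper-case letter, Lt = title-case letter
    if first == "_" or category in ("Lu", "Lt"):
        return "variable"
    # Ll = lower-case letter
    if category == "Ll":
        return "atom"
    # Lo = letter without any case, e.g. CJK ideographs
    if category == "Lo":
        return "caseless"
    return "other"

for name in ("Count", "count", "_count", "数"):
    print(name, classify_identifier(name))
```

A caseless first letter fits neither rule, which is why some extra marker
or restriction would be needed if such letters were allowed to start
identifiers.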

>> With one exception: it would be very nice if the language allowed 
>> Unicode within string literals. That's more a question of how to 
>> integrate binary data into source code well.
> 
> It might also be useful in comments, if they aren't written in English - 
> Japanese, Russian and other languages that have completely different 
> character sets will be rather tedious to encode in some kind of 
> ASCII/Latin-1 version.

Agreed.
Though the Russians tend to manage somehow - I've been seeing a lot of 
Russian software lately.
Actually, all the non-Western languages have ways of transliterating to 
Western script. AFAIK there are even several schemes to choose from for 
any such language.
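
As a rough illustration of what transliteration to Western script means for
source text, here is a small Python sketch. The mapping table is
deliberately partial and simplified; it is an assumption made for this
example and does not follow any particular published scheme. The names
TRANSLIT and transliterate are hypothetical.

```python
# Partial, simplified Cyrillic-to-Latin table (illustrative assumption only).
TRANSLIT = {
    "п": "p", "р": "r", "и": "i", "в": "v", "е": "e", "т": "t",
    "к": "k", "о": "o", "д": "d", "м": "m", "а": "a",
}

def transliterate(text):
    """Replace each mapped character; leave everything else unchanged."""
    return "".join(TRANSLIT.get(ch, ch) for ch in text)

print(transliterate("привет"))
print(transliterate("код"))
```

With a complete table, the result fits in 7-bit ASCII, at the cost of being
harder to read for native speakers.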

Re comment usage: In my book, comments are an integral part of the 
source code. If a comment isn't necessary to understand the code, it's 
redundant and should be removed; if it's necessary, it should be written 
in the same language as the source code.
From this point of view, there's no need for extra allowances in comments.
Things might be different in programming courses - students will be 
pretty occupied with wrapping their minds around programming concepts; 
having to transliterate and translate would be an additional and 
unwanted burden.

Just my 2c.

Regards,
Jo