[erlang-questions] Erlang 3000?

Richard O'Keefe ok@REDACTED
Thu Nov 20 02:51:46 CET 2008


On 20 Nov 2008, at 12:03 am, Johnny Billquist wrote:
>> "straße" - German for "street" - can be encoded in ISO Latin 1, but
>> both your code and the one in the standard string module will fail
>> to convert it to upper-case properly.
>
> Well, Latin-1 don't have an uppercase version of ß, so that letter
> can't be converted, if you want to stay in Latin-1. The rest will do
> just fine.

You are assuming that case mapping always converts exactly one
character into exactly one character.
Even in Latin-1, thanks to $ß, that isn't true:
the upper case equivalent of that character is two characters ("SS"),
both of which are Latin-1.

Back in the 1980s, when I was working on Quintus Prolog,
this assumption had already broken down.  I see the comment
"This predicate does _not_ work well with languages like German"
on a case conversion predicate in a file dated 1987 and a
similar comment in a file dated 1988, and I'm pretty sure my
awareness of the issue is related to studying the XNS 16-bit
character set in about 1985.  (The files in question do not
have histories, so they may well be older.)

If we want to get Erlang Unicode-ready, we'll have to purge
our code of character-at-a-time case mapping and replace it
by whole-string case mapping, but the problem already existed
20 years ago.
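
To make the difference concrete, here is a minimal sketch of
whole-string upper-casing restricted to Latin-1. The module name
latin1_case and the function to_upper/1 are invented for this
illustration and are not part of any library. The sketch assumes a
string is a list of Latin-1 codes and ignores locale-specific rules.
The point is only that each character maps to a list of characters,
so the result can be longer than the input.

```erlang
-module(latin1_case).
-export([to_upper/1]).

%% Hypothetical helper for illustration only; not part of OTP.
%% Upper-cases a Latin-1 string (a list of integers 0..255).
%% The result may be longer than the input.
to_upper(String) ->
    lists:flatmap(fun upper/1, String).

%% Map one character to a *list* of characters.
upper(16#DF) -> "SS";                              % sharp s: one character becomes two
upper(C) when C >= $a, C =< $z -> [C - 32];        % ASCII letters
upper(C) when C >= 16#E0, C =< 16#FE, C =/= 16#F7 ->
    [C - 32];                                      % accented letters; 16#F7 is the division sign
upper(C) -> [C].                                   % everything else is left unchanged
```

With this sketch, latin1_case:to_upper([$s,$t,$r,$a,16#DF,$e]) gives
"STRASSE": six characters in, seven out, which a function from one
character to one character cannot express.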




