[erlang-questions] Erlang 3000?

Toby Thain <>
Wed Nov 19 16:50:53 CET 2008


On 19-Nov-08, at 9:04 AM, Johnny Billquist wrote:

> Richard Carlsson wrote:
>> Bengt Kleberg wrote:
>>> the facts of current German orthography are that the
>>> uppercase of ß is "SS"
>>
>> Quite. The lesson should be that even "within the limitations of
>> Latin-1", the idea that you can do case conversion on single
>> code points is wrong. It is an operation that should be applied
>> to strings, not individual characters.
>
> And I don't agree. You are mixing semantics with syntax, in my mind
> (syntax is probably not the right word here, but I'm no typographer, so I
> don't know the correct term; I hope you understand what I mean).
> There is no uppercase version of ß, so it can't be converted to uppercase.
> The fact that you write SS instead of ß when you want it in uppercase
> doesn't mean that it's the same letter, just that it has the same meaning.
>
> Conversion of a string to uppercase can be regarded in two ways. Either
> you replace each character with its uppercase version, and leave alone
> the characters that don't have an uppercase version.
>
> Or you can try to convert the string as such to an uppercase version,
> where some letters might need to be replaced by sequences of other
> characters.
>
> I personally am usually satisfied with the former, but I guess that's
> anyone's choice.

How could a German reader *possibly* be satisfied with the incorrect
result of the overly simple-minded method?!
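
To make the two approaches concrete, here is a minimal sketch. It is in Python purely for illustration (nothing in this thread uses it), and both function names are hypothetical. It assumes Python 3, whose str.upper() applies the full Unicode case mapping.

```python
def upper_per_code_point(text):
    """Uppercase one code point at a time (the first approach).

    A character is left unchanged when its uppercase form is not a
    single code point, which is what happens to the German sharp s.
    """
    out = []
    for ch in text:
        up = ch.upper()
        out.append(up if len(up) == 1 else ch)
    return "".join(out)


def upper_whole_string(text):
    """Uppercase the string as a whole (the second approach).

    The result may be longer than the input.
    """
    return text.upper()


word = "stra\u00dfe"  # "straße"
per_char = upper_per_code_point(word)
whole = upper_whole_string(word)
```

With these definitions, per_char still contains the lowercase ß, while whole is "STRASSE", one character longer than the input.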

>
> And I also believe that this is one of the more serious flaws of
> Unicode. It mixes semantics with syntax. So you have, for instance,
> several A-ring characters, for use in different types of contexts, but
> that is all artificial and unfortunate.
> It's like in the old days, when you had several different minus signs on
> punched cards, for different uses. Hmm, looking at Unicode, I can see
> that they have reintroduced this ambiguity. You have hyphen-minus
> (U+002D), hyphen (U+2010) and minus (U+2212), and you also have a number
> of different dashes.
> Try to figure out which one you want when you are writing.

Nothing substitutes for a typographic education if you wish to use a  
typographic palette.
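
For anyone wanting to see what is actually in a piece of text, the code points can at least be inspected. A small sketch, again in Python for illustration only; the CANDIDATES list and the describe helper are my own hypothetical names, and unicodedata is the standard-library module.

```python
import unicodedata

# The three characters named above, plus two of the "different dashes".
CANDIDATES = ["\u002D", "\u2010", "\u2212", "\u2013", "\u2014"]


def describe(ch):
    """Return the code point and the official Unicode name of one character."""
    return "U+%04X %s" % (ord(ch), unicodedata.name(ch))


descriptions = [describe(ch) for ch in CANDIDATES]
for line in descriptions:
    print(line)
```

The names come from the Unicode character database shipped with the interpreter, so they say which character was typed, not which one was intended.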

> (According to one myth this "problem" actually caused the Mariner 1 to
> fail and self-destruct, since the poor Fortran programmer had used a
> hyphen instead of a minus for a constant. Not sure if it's true or not,
> and the web doesn't give a sure answer.)

Unlikely to be true, but a failure of testing, surely.

>
> (Oh, and the A-ring problem is that there is a unit called Ångström,
> which uses the symbol Å. However, in Swedish, A-ring (Å) is a normal,
> plain letter, and the guy Ångström was a Swede, and the unit was named
> after him, with the first letter of his last name as the unit symbol. But
> with Unicode we now need to know if we're writing the letter Å or the unit
> Å, which is a different code point, even though it actually is the same
> letter.
> There are more examples like this,

Hundreds more (for example, Hebrew or Fraktur letters as mathematical  
symbols). And there may be an associated rationale.

> where Unicode messes things up because
> it mixes the visual impression of a character with the semantic meaning of
> the character.)
>
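
On the Å example specifically: the two code points are distinct, but Unicode defines them as canonically equivalent, so normalizing before comparing makes them equal. A hedged sketch in Python (for illustration only; same_after_nfc is a hypothetical helper, and unicodedata.normalize is the standard-library call):

```python
import unicodedata

ANGSTROM_SIGN = "\u212B"   # the unit symbol
LETTER_A_RING = "\u00C5"   # the Swedish letter
A_PLUS_RING = "A\u030A"    # plain A followed by COMBINING RING ABOVE


def same_after_nfc(a, b):
    """Compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


raw_equal = ANGSTROM_SIGN == LETTER_A_RING
normalized_equal = same_after_nfc(ANGSTROM_SIGN, LETTER_A_RING)
composed_equal = same_after_nfc(A_PLUS_RING, LETTER_A_RING)
```

Here raw_equal is False while normalized_equal and composed_equal are both True, so code that normalizes its input does not need to know which of the forms the writer typed.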
> And when I learned German in school many years ago, I was taught that ß
> was more or less the equivalent of sz. :-)

Is that relevant? The orthographic rule is what it is.

--Toby

>
> 	Johnny



