[erlang-questions] correct terminology for referring to strings

Richard O'Keefe ok@REDACTED
Thu Aug 2 03:28:35 CEST 2012


On 2/08/2012, at 6:42 AM, Thomas Lindgren wrote:
> 
> How about adding compiler warnings about string literals that do not obey
> the designated encoding? (There should then, of course, be multiple possibilities to choose from.)
> 

What does this actually mean?

There is no byte sequence valid in UTF-8 that is not also
valid in Latin-1.  Yes, codes 128..159 are control characters,
but nobody ever said that control characters weren't legal in
strings.  Checking the mappings that came with Unicode 4,
there is no byte sequence valid in UTF-8 that is not also
valid in ISO 8859-{1,2,4,5,9,10,13,14,15}, PC code pages
437, 737, 775, 850, 852, 855, 86[012356], and Apple Arabic,
Central European, Croatian, Cyrillic, Farsi, Greek, Hebrew,
Icelandic, Roman, Romanian, Squeak, and Turkish.
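
To make the point concrete, here is a minimal sketch in Python. It
assumes Python's built-in codec tables, which are not the Unicode 4
mapping files referred to above, and the helper name unmapped_bytes
is made up for illustration. It lists the byte values a codec refuses
to decode, then reads some UTF-8 bytes back as Latin-1.

```python
def unmapped_bytes(codec):
    """Return the byte values (0..255) that `codec` refuses to decode."""
    bad = []
    for value in range(256):
        try:
            bytes([value]).decode(codec)
        except UnicodeDecodeError:
            bad.append(value)
    return bad


# A string literal stored as UTF-8 (the text is "naive cafe" with
# i-diaeresis and e-acute, written with escapes to keep this file ASCII).
utf8_bytes = "na\u00efve caf\u00e9".encode("utf-8")

# The same bytes decode without error when read as Latin-1.
# The result is different text, but it is not an invalid string.
as_latin1 = utf8_bytes.decode("latin-1")

if __name__ == "__main__":
    for codec in ("latin-1", "cp437"):
        print(codec, unmapped_bytes(codec))
    print(as_latin1)
```

If a codec maps every byte value, then no byte sequence at all can
fail to decode under it, whether or not it happens to be UTF-8. So a
validity check against such an encoding has nothing to warn about.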

So I have no idea what "string literals that do not obey
the designated encoding" means or how to operationalise it.



