[erlang-questions] correct terminology for referring to strings

Ian hobson42@REDACTED
Tue Jul 31 13:48:31 CEST 2012


On 31/07/2012 10:24, Joe Armstrong wrote:
> I'm working on a 2nd edition of my book, and have got to strings :-)
> Strings confuse everybody, including me so I have a few questions:
>
> To start with Erlang doesn't have strings - it has lists (not strings)
> and it has string literals.
>
> I want to define a string - is this correct:
>
> << A "string" is a list of integers where the integers
>        represent Unicode codepoints. >>

I think this is technically correct, but it is very confusing because it 
implies that the source file may be encoded as Unicode.

As I understand it, source files are always treated as being in Latin-1. 
This means that string literals are lists of Latin-1 values, and not 
lists of Unicode codepoints. (The values from 128 to 255 have 
different or no meanings, and values > 255 cannot occur.)

If you encode your source as something other than Latin-1, the result is 
a mis-encoding of your literal string, with all the problems that 
brings (vanishing and reappearing characters, wrong lengths, etc.).
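
As a rough illustration of the wrong-length problem, here is a minimal
sketch. It is written in Python rather than Erlang purely because the
encode and decode steps are explicit there; it assumes the source was
saved as UTF-8 and then read byte-for-byte as Latin-1. The function name
and the sample word are made up for the example.

```python
def latin1_view_of_utf8(text):
    """Return what a Latin-1 reader sees when the bytes are really UTF-8."""
    utf8_bytes = text.encode("utf-8")    # what the editor wrote to disk
    return utf8_bytes.decode("latin-1")  # one character per byte, as Latin-1


word = "caf\u00e9"                  # four characters, the last is e-acute
seen = latin1_view_of_utf8(word)

print(len(word), len(seen))         # the literal grows by one element
print([ord(c) for c in seen])       # the integer values a reader would get
```

The single e-acute is stored as two bytes in UTF-8, so a reader that
treats every byte as one Latin-1 character ends up with a list one
element longer than intended, and neither of the two extra values is
the codepoint that was meant.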

The REPL does take notice of the locale and so can produce different 
results from the same source strings!

I don't envy you the task of writing something that is clear, correct, 
concise and comprehensible. That will be a challenge!

Regards

Ian

> Questions:
>      Is the sentence inside << .. >> using the correct terminology?
>      If not what should it say?
>
>      Is the sentence inside << ... >> widely understood, do you think this
>      would confuse a lot of people?
>
>      Is the phrase "string literal" widely understood?
>
>
> Cheers
>
> /Joe
> _______________________________________________
> erlang-questions mailing list
> erlang-questions@REDACTED
> http://erlang.org/mailman/listinfo/erlang-questions
>