[erlang-questions] correct terminology for referring to strings
Tue Jul 31 13:48:31 CEST 2012
On 31/07/2012 10:24, Joe Armstrong wrote:
> I'm working on a 2nd edition of my book, and have got to strings :-)
> Strings confuse everybody, including me, so I have a few questions:
> To start with, Erlang doesn't have strings - it has lists (not strings)
> and it has string literals.
> I want to define a string - is this correct:
> << A "string" is a list of integers where the integers
> represent Unicode codepoints. >>
I think this is technically correct, but it is very confusing because it
implies that the source file may be encoded as Unicode.
As I understand it, source files are always treated as being in Latin-1.
This means that string literals are lists of Latin-1 values, and not
lists of Unicode codepoints. (The values from 128 to 255 have
different/no meanings, and values > 255 will not happen).
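
To make the "list of integers" point concrete, here is a minimal sketch.
The module name string_demo and both function names are made up for
illustration; the literal is plain ASCII so the encoding of the source
file cannot affect the result.

```erlang
-module(string_demo).
-export([is_same/0, latin1_range/1]).

%% A string literal is only shorthand for a list of integers,
%% so this comparison evaluates to true.
is_same() ->
    "abc" =:= [97, 98, 99].

%% Hypothetical helper: true if every element is an integer that
%% fits in a single Latin-1 byte (0..255).
latin1_range(Str) ->
    lists:all(fun(C) -> is_integer(C) andalso C >= 0 andalso C =< 255 end,
              Str).
```

If the literals really are Latin-1 values, latin1_range/1 should hold
for any literal taken from such a source file, while a list such as
[8364] (a codepoint above 255) fails the check.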
If you encode your source as something other than Latin-1, the result is
a mis-encoding of your literal string, with all the problems that
presents (vanishing and reappearing characters, wrong lengths, etc.).
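
The "wrong lengths" problem can be sketched without depending on how any
file is saved. This is only an illustration, assuming the source was
written as UTF-8 and then read one byte per character as Latin-1; the
module and function names are invented, and the input is given as
numeric codepoints rather than as a literal.

```erlang
-module(miscoding_demo).
-export([as_seen_by_latin1/1]).

%% Encode the intended codepoints as UTF-8, then treat every
%% resulting byte as one character, which is what a reader that
%% assumes Latin-1 would do with UTF-8 input.
as_seen_by_latin1(CodePoints) ->
    Utf8 = unicode:characters_to_binary(CodePoints),
    binary_to_list(Utf8).
```

For example, miscoding_demo:as_seen_by_latin1([233]) takes the single
codepoint for e-acute and returns [195,169], a two-element list, so
length/1 reports 2 where the author intended 1.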
The REPL does take notice of the locale and so can produce different
results from the same source strings!
I don't envy you the task of writing something that is clear, correct,
concise and comprehensible. That will be a challenge!
> Is the sentence inside << .. >> using the correct terminology?
> If not what should it say?
> Is the sentence inside << ... >> widely understood, or do you think it
> would confuse a lot of people?
> Is the phrase "string literal" widely understood?