[erlang-questions] correct terminology for referring to strings

Michael Turner michael.eugene.turner@REDACTED
Wed Aug 1 09:32:57 CEST 2012

On Wed, Aug 1, 2012 at 1:33 PM, Richard O'Keefe <ok@REDACTED> wrote:
> On 31/07/2012, at 9:53 PM, Michael Turner wrote:
>>> << An Erlang "string" is simply a list of integers.  Each integer can
>>> represent any Unicode codepoint/character. >>
>> Except that Unicode codepoints represents characters, right?
> Wrong.

Actually, what's *really* wrong in my statement is the grammar -- bad
plural agreement.

> One Unicode codepoint may represent what a particular language
> views as two distinct graphemes.  (This occurs in encoding English,
> for example: in 'belovéd' the diacritical mark is a stress accent
> and so é counts as two separate graphemes.)

[snip much more]

I'm certain this is correct, Richard, but ... what problem are we
trying to solve again? IIRC: Joe is trying to come up with a short
passage that explains what strings are, in Erlang. If he writes all
that you wrote above, the reader (who might have been initially
excited about Erlang) will come away with the impression, "Erlang
people are excruciatingly pedantic".

> I keep meaning to write a small book called "Strings Made Difficult."

Sounds like you're the man to do it, Richard.

As I wrote earlier:

>> << In Erlang, strings are represented as lists of integers. These
>> integers are Unicode codepoints, each representing a character. >>
>> That way, anybody who's unclear on what "codepoint" means gets a
>> freebie definition of it. In the Unicode context, it's probably wrong,
>> technically, but perhaps good enough for this purpose.
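
A minimal sketch of what that two-sentence definition claims, and of where
"each representing a character" gets shaky. It is an illustration only, not
taken from Joe's draft; the module name string_demo and the function demo/0
are hypothetical. It checks that a string literal is the same term as a list
of integers, and that the é in 'belovéd' can be spelled as one codepoint
(U+00E9) or as two (e followed by combining acute, U+0301).

```erlang
-module(string_demo).
-export([demo/0]).

%% Hypothetical demo: each match below crashes with badmatch if the
%% claim in the comment above it were false.
demo() ->
    %% A string literal is just syntax for a list of integers.
    true = ("abc" =:= [97, 98, 99]),
    %% $a is the integer codepoint of the character a.
    true = ([$a, $b, $c] =:= "abc"),
    %% 'belovéd' with a precomposed é (U+00E9): 7 codepoints.
    Precomposed = [$b, $e, $l, $o, $v, 16#E9, $d],
    7 = length(Precomposed),
    %% The same word with e + combining acute accent (U+0301): 8 codepoints.
    Decomposed = [$b, $e, $l, $o, $v, $e, 16#301, $d],
    8 = length(Decomposed),
    %% The two lists are different terms, even though they display alike.
    false = (Precomposed =:= Decomposed),
    ok.
```

Calling string_demo:demo() returns ok if every match holds. The last two
checks are the part the short definition glosses over: the list length counts
codepoints, which need not be what a reader would count as characters.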

Can anyone tell me why this *wouldn't* serve Joe's (== the typical
reader's) purposes? [*]

-michael turner

[*] Why do I suspect we're now going to have a long digression on
whether the "==" in  "(== the typical reader's)" should really be

>> On Tue, Jul 31, 2012 at 6:41 PM, Paul Barry <paul.james.barry@REDACTED> wrote:
>>> Hi Joe.
>>> I think "string literal" is pretty widely understood (it even has a
>>> Wikipedia entry, here: http://en.wikipedia.org/wiki/String_literal).
>>> What threw me about your sentence was the use of the word 'codepoint',
>>> which will be OK for those already familiar with Unicode, but might
>>> confuse those who are not.  My feeling (and this might be a gross
>>> over-simplification) is that most North-American programmers know
>>> about Unicode but don't let it worry them too much, resulting in less
>>> of a familiarity with it than might be necessary (and I apologize to
>>> any North-American programmers that this comment rubs the wrong way).
>>> Perhaps "Unicode characters" might be easier to read/understand?
>>> Although probably not totally technically correct...
>>> Another thing that you might wish to consider is breaking the sentence
>>> in two, as follows:
>>> << An Erlang "string" is simply a list of integers.  Each integer can
>>> represent any Unicode codepoint/character. >>
>>> Just my 2 cents.
>>> Paul.
>>> On 31 July 2012 10:24, Joe Armstrong <erlang@REDACTED> wrote:
>>>> I'm working on a 2nd edition of my book, and have got to strings :-)
>>>> Strings confuse everybody, including me, so I have a few questions:
>>>> To start with, Erlang doesn't have strings - it has lists (not strings)
>>>> and it has string literals.
>>>> I want to define a string - is this correct:
>>>> << A "string" is a list of integers where the integers
>>>>      represent Unicode codepoints. >>
>>>> Questions:
>>>>    Is the sentence inside << .. >> using the correct terminology?
>>>>    If not what should it say?
>>>>    Is the sentence inside << ... >> widely understood, or do you think it
>>>>    would confuse a lot of people?
>>>>    Is the phrase "string literal" widely understood?
>>>> Cheers
>>>> /Joe
>>>> _______________________________________________
>>>> erlang-questions mailing list
>>>> erlang-questions@REDACTED
>>>> http://erlang.org/mailman/listinfo/erlang-questions
>>> --
>>> Paul Barry, w: http://paulbarry.itcarlow.ie - e: paul.barry@REDACTED
>>> Lecturer, Computer Networking: Institute of Technology, Carlow, Ireland.
