[erlang-questions] correct terminology for referring to strings

Richard O'Keefe ok@REDACTED
Wed Aug 1 06:45:15 CEST 2012


On 1/08/2012, at 4:49 AM, Richard Carlsson wrote:

> Yes. That's why there needs to be a new Unicode-aware string library. Operating directly on lists (e.g. using lists:reverse/1, or even length/1) is always going to have surprising effects, and the old 'string' module in stdlib probably can't be modernized while maintaining backwards compatibility.

It should be noted that the "length" of a Unicode string is an inherently
ambiguous concept anyway.  It makes sense to ask how many codepoints
there are in the string as given, or how many there would be in a particular
normalisation form, but neither of those is how many characters the *user*
would count (which is not just script-sensitive, not just locale-sensitive,
but very context-sensitive).  Oh, and none of them is the same as "how many
columns would this require in a fixed-width font."
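
To make the distinction concrete, here is a small sketch. It is in Python only because its standard unicodedata module makes the counts easy to check; it is not part of any Erlang library. The sample strings and the helper name columns/1 are my own assumptions, and the column estimate is a rough heuristic (wide and fullwidth characters count as 2, combining marks as 0), not what any particular terminal does. User-perceived characters are not attempted here, since they need grapheme cluster segmentation and, as noted above, context.

```python
import unicodedata

# 'e' followed by U+0301 COMBINING ACUTE ACCENT: one user-visible character.
s = "e\u0301"

# Codepoint counts: as given, and in two normalisation forms.
len_as_given = len(s)
len_nfc = len(unicodedata.normalize("NFC", s))  # composes to U+00E9
len_nfd = len(unicodedata.normalize("NFD", s))  # stays decomposed


def columns(text):
    """Rough fixed-width column estimate (hypothetical helper).

    Assumption: combining marks take 0 columns, East Asian wide and
    fullwidth characters take 2, everything else takes 1.
    """
    total = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue
        total += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return total


print(len_as_given, len_nfc, len_nfd, columns(s))
print(len("\u65e5\u672c"), columns("\u65e5\u672c"))  # two CJK ideographs
```

For the first string the three codepoint counts already disagree with each other, and for the second the codepoint count and the column estimate disagree, so "length" needs to say which of these it means.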




