[erlang-questions] The importance of Basic Unicode Understanding in Erlang
Thu Sep 29 00:52:16 CEST 2011
On 29/09/2011, at 10:14 AM, Richard Carlsson wrote:
> - The "good old length and comparison functions" are not broken, they just answer much simpler questions than what you're asking. length(S) tells you how many code points are in string S, no more, no less. Not glyphs, not graphemes, not abstract characters. Code points.
I should point out that the question "how many characters are there" is locale-dependent.
My mother's father, looking at the place name "Ǉubǉana" would have seen 7 letters.
I see 9. (There are in fact 7 Unicode code points. Who said one code point couldn't
count as more than one letter?) Looking at my father's middle name, "Æneas", I see
5 letters. (Unicode agrees with me.) Other people see 6.
This means that there is no such thing as a "Unicode" function
    grapheme_length :: String → Integer
but only a function
    grapheme_length :: String × Locale → Integer
This is only the beginning of the problems!
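
To make the two-argument signature concrete, here is a minimal sketch in Python. The rule tables and the locale names ("lj_single", "lj_double", "ae_double") are hypothetical and hand-written for this illustration. Real rules would come from locale data, and this toy counts letters per code point rather than performing proper grapheme segmentation.

```python
# Toy illustration only: the locale names and rule tables are made up.
# Each table maps a code point to the number of letters a reader counts for it;
# any code point not listed counts as one letter.
LETTER_RULES = {
    # A reader for whom the digraph LJ is a single letter of the alphabet.
    "lj_single": {},
    # A reader who sees the digraph as two letters, L and J.
    "lj_double": {"\u01C7": 2, "\u01C8": 2, "\u01C9": 2},
    # A reader who sees the ligature AE as two letters, A and E.
    "ae_double": {"\u00C6": 2, "\u00E6": 2},
}


def grapheme_length(s: str, locale: str) -> int:
    """Count letters in s as a reader using the given (hypothetical) locale would."""
    rules = LETTER_RULES[locale]
    return sum(rules.get(ch, 1) for ch in s)


place = "\u01C7ub\u01C9ana"   # the place name from the text
name = "\u00C6neas"           # the middle name from the text

print(len(place))                              # 7 code points
print(grapheme_length(place, "lj_single"))     # 7 letters
print(grapheme_length(place, "lj_double"))     # 9 letters
print(len(name))                               # 5 code points
print(grapheme_length(name, "ae_double"))      # 6 letters
```

The same string gives different answers depending only on the second argument, which is why a one-argument version cannot be correct for every reader.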
> Similar for comparisons.
And again, similar for comparisons.
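
A rough sketch of the same point for comparisons, again in Python. The two key functions are hand-written assumptions for this example, not a real collation library: one follows the Swedish alphabet, where å, ä and ö are separate letters after z, and the other follows German dictionary ordering, where ä, ö and ü sort as a, o and u. The words are arbitrary test strings.

```python
# Toy sort keys; a real program would use proper locale collation data.

def swedish_key(word: str) -> list:
    # Swedish alphabet: å, ä, ö are distinct letters placed after z.
    # Only handles letters in this alphabet (index raises ValueError otherwise).
    alphabet = "abcdefghijklmnopqrstuvwxyz\u00E5\u00E4\u00F6"
    return [alphabet.index(ch) for ch in word.lower()]


def german_key(word: str) -> str:
    # German dictionary ordering: ä, ö, ü are sorted as a, o, u.
    fold = str.maketrans("\u00E4\u00F6\u00FC", "aou")
    return word.lower().translate(fold)


words = ["zon", "\u00E4ng", "arm"]

print(sorted(words, key=swedish_key))  # ['arm', 'zon', 'äng']
print(sorted(words, key=german_key))   # ['äng', 'arm', 'zon']
```

Neither ordering is wrong; they answer the question for different readers, so a comparison function also needs the locale as an argument.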