[erlang-questions] Strings as Lists
Sat Feb 16 00:02:31 CET 2008
> My question was in regards to reversing strings, not lists of
> Specifically, Hasan Veldstra's complaint that representing strings as lists
> doesn't work when you use lists:reverse to reverse them:
That isn't what I said. I gave an example of where string reversal
fails as a consequence of treating Unicode code points as
characters ("characters" from the user's point of view, not how Unicode
>> This would not work on a string with combining characters, e.g. ü
>> represented as u followed by ¨, or a CJKV ideograph.
>> A lot of glyphs *cannot* be represented by a single Unicode
> Your example is a case on "unreversing" a reversal done during the
Sorry, I'm not following you here. I didn't even mention upcasing in
my last message.
> My guess is that in the "ü represented as u followed by ¨" case, it would
> work just right: the "u" would be up-cased to "U", and the "¨" would follow
> capital "U" (following the list:reverse to unreverse the list).
Yes, maybe this would work, thanks to Erlang's awareness of Western
European scripts. But how would you convert this string to uppercase
in Erlang: "Καλημέρα κόσμε"? With the libraries available now, it's
impossible.
How about case-insensitive comparison of strings containing Russian
text? Or even a case-insensitive comparison of "straße" and
"STRASSE"? Again, there is no library support.
Or how about comparing two strings that look identical when printed,
but where one contains the pre-composed "ü" character and the other
contains "u" followed by "¨"? Again, you can't do this, or similar
comparisons, reliably using plain lists, unless you implement
Unicode from scratch yourself, of course.
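
The pre-composed versus decomposed comparison can be sketched as follows,
again in Python purely for illustration. It assumes the "¨" above means
U+0308 COMBINING DIAERESIS, and the helper name canonically_equal is
hypothetical. The standard unicodedata module does the normalization.

```python
import unicodedata

precomposed = "\u00fc"   # "ü" as a single code point
decomposed = "u\u0308"   # "u" followed by COMBINING DIAERESIS (assumed)


def canonically_equal(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFC (hypothetical helper)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# Code point by code point, the two strings differ.
raw_equal = precomposed == decomposed                        # False

# After normalization they compare equal.
normalized_equal = canonically_equal(precomposed, decomposed)  # True
```

Comparing the raw sequences is what a plain list comparison does; the
normalization step is the part that needs Unicode data tables.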
> I don't think up-casing a CJKV ideograph makes any sense
I know little about East Asian scripts, and I don't know if they have
the uppercase/lowercase distinction, but I never said you'd want to
upcase a CJKV ideograph.
> So the question goes back to Mr. Veldstra (or anyone) as to why you
> want to reverse a Unicode string
I don't know. String reversal was a convenient example for the point
I was trying to make.
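
For the record, the reversal failure can be sketched like this, in Python
for illustration only. The function names are hypothetical, and
reverse_keeping_marks is a deliberate simplification: it only keeps
combining marks attached to the preceding base character and is not a full
implementation of Unicode grapheme cluster segmentation.

```python
import unicodedata


def naive_reverse(s: str) -> str:
    """Reverse code point by code point, as reversing a plain list would."""
    return s[::-1]


def reverse_keeping_marks(s: str) -> str:
    """Reverse a string while keeping each combining mark after its base.

    Simplified sketch: a code point with a non-zero combining class is
    attached to the code point before it.
    """
    clusters = []
    for ch in s:
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch      # attach the mark to its base character
        else:
            clusters.append(ch)     # start a new cluster
    return "".join(reversed(clusters))


word = "gru\u0308n"                 # "grün" with a decomposed "ü"

broken = naive_reverse(word)        # the diaeresis now follows "n"
intact = reverse_keeping_marks(word)  # the diaeresis stays on "u"
```

In broken, the combining diaeresis ends up after the "n", so the mark is
rendered on the wrong letter; in intact it stays with the "u".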