[erlang-questions] Strings as Lists
Fri Feb 15 18:35:05 CET 2008
> This is what you should have in your list:
> 1> Text = [16#442, 16#435, 16#43a, 16#441, 16#442].
> You can convert it to utf8 for output
> 2> xmerl_ucs:to_utf8(Text).
> And you can reverse it and convert that to utf8.
> 3> xmerl_ucs:to_utf8(lists:reverse(Text)).
Reversing the list this way would not work on a string with combining
characters, e.g. ü represented as u followed by a combining ¨ (U+0308),
or a CJKV ideograph.
A lot of glyphs *cannot* be represented by a single Unicode codepoint.
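
To make the problem concrete, here is a minimal sketch using only
lists:reverse/1 on a list of code points. The module and function names
are hypothetical and exist only for this illustration.

```erlang
-module(combining_demo).
-export([reversed/0]).

%% "u", then U+0308 COMBINING DIAERESIS, then "b": displays as "üb".
%% The module and function names are made up for this example.
reversed() ->
    Text = [$u, 16#308, $b],
    %% lists:reverse/1 reverses list elements (code points), so the
    %% diaeresis ends up after "b" and no longer attaches to "u".
    lists:reverse(Text).
```

Calling combining_demo:reversed() yields [98,776,117], i.e. b, the
combining diaeresis, then u. The mark now modifies the b, so the output
is not the visual reverse of the input.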
Plain lists or binaries are good enough in two cases:
1. You don't need to support anything other than ISO Latin-1 (i.e.
Western European languages).
2. You don't need to do much with the Unicode text apart from simply
storing it and spitting it back to the user as-is.
For any other case, what Erlang/OTP offers now is subpar compared to
other modern languages / platforms.
Implementing Unicode from scratch is nasty, and the DIY attitude is
unproductive and dangerous. There needs to be a standard library,
used and tested by everyone.
As I already mentioned in this thread, I'm working on such a library,
and will release an alpha version soon.