[erlang-questions] Strings as Lists

Bjorn Gustavsson bjorn@REDACTED
Fri Feb 15 12:01:55 CET 2008


Richard Carlsson <richardc@REDACTED> writes:

> What Erlang needs to support non-Latin-1 languages is filters for decoding
> input and encoding output. (Right now, you have to write the conversion
> functions yourself if you want to work with Russian text.) The internal
> string representation - lists of integers using one integer per code
> point - needs no modification, whether it's ASCII, Latin-1, or Unicode;
> what I said before applies equally well to all of them. Multibyte encodings
> are not practical for general string manipulations regardless of how they
> are stored in memory.

I can confirm that it is possible to use lists of Unicode characters, and
quite easy too. In Wings 3D, I have implemented my own limited support for
Unicode.

1. To translate lists of UTF-8 bytes to lists of Unicode code points,
there is the function wings_util:expand_utf8/1. (Wings keeps all text strings
for languages other than English in text files, which are read as needed.
If you want to have Russian text in strings in the actual source code files,
you could write a simple parse transform to handle the translation.)

2. As a simple replacement for io:format/2, there is
wings_util:format/2 which doesn't have all the functionality of
io:format/2, but allows arguments to ~s to be lists containing Unicode
characters.

3. For output, Wings has its own fonts and font handling (meaning that Wings
has no need to translate back to UTF-8 on output).
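
The decoding step in item 1 can be sketched roughly as follows. This is
not the Wings source: the module name utf8_sketch is hypothetical, and the
function only borrows the name expand_utf8/1 for illustration. It assumes
the input is a well-formed list of UTF-8 bytes and does no validation of
overlong forms or surrogates; malformed input simply fails to match.

```erlang
-module(utf8_sketch).
-export([expand_utf8/1]).

%% Hypothetical stand-in for wings_util:expand_utf8/1 (an assumption,
%% not the real implementation). Turns a list of UTF-8 bytes into a
%% list of Unicode code points, one integer per code point.

%% One-byte sequence: 0xxxxxxx (plain ASCII).
expand_utf8([C | T]) when C < 16#80 ->
    [C | expand_utf8(T)];
%% Two-byte sequence: 110xxxxx 10xxxxxx.
expand_utf8([C1, C2 | T])
  when C1 band 16#E0 =:= 16#C0,
       C2 band 16#C0 =:= 16#80 ->
    [((C1 band 16#1F) bsl 6) bor (C2 band 16#3F) | expand_utf8(T)];
%% Three-byte sequence: 1110xxxx 10xxxxxx 10xxxxxx.
expand_utf8([C1, C2, C3 | T])
  when C1 band 16#F0 =:= 16#E0,
       C2 band 16#C0 =:= 16#80,
       C3 band 16#C0 =:= 16#80 ->
    [((C1 band 16#0F) bsl 12) bor
     ((C2 band 16#3F) bsl 6) bor
     (C3 band 16#3F) | expand_utf8(T)];
%% Four-byte sequence: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx.
expand_utf8([C1, C2, C3, C4 | T])
  when C1 band 16#F8 =:= 16#F0,
       C2 band 16#C0 =:= 16#80,
       C3 band 16#C0 =:= 16#80,
       C4 band 16#C0 =:= 16#80 ->
    [((C1 band 16#07) bsl 18) bor
     ((C2 band 16#3F) bsl 12) bor
     ((C3 band 16#3F) bsl 6) bor
     (C4 band 16#3F) | expand_utf8(T)];
expand_utf8([]) ->
    [].
```

For example, utf8_sketch:expand_utf8([16#D0, 16#96]) yields [16#416]
(CYRILLIC CAPITAL LETTER ZHE), and the result is an ordinary list that the
usual list functions can work on. Later OTP releases include a unicode
module whose characters_to_list/2 covers the same ground.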

/Bjorn
-- 
Björn Gustavsson, Erlang/OTP, Ericsson AB


