[erlang-questions] unicode in string literals

CGS <>
Mon Jul 30 15:06:38 CEST 2012


Hi Joe,

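You can try the unicode module, passing the literal as a binary:

test() -> unicode:characters_to_list(<<"a∞b">>, utf8).

This returns the desired list [97,8734,98]. Passing the plain string
"a∞b" does not help: only binaries are decoded, while integers in a list
are taken to be code points already. As Richard said, the compiler
assumes the source is Latin-1 (integers 0-255), so each byte of the
UTF-8 encoded character ends up as a separate element.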
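
A minimal sketch of the decoding step, with the five bytes written out
explicitly so the result does not depend on how the source file is
read. The module and function names (utf8_demo, decode_bytes,
list_passthrough) are made up for illustration:

```erlang
-module(utf8_demo).
-export([decode_bytes/0, list_passthrough/0]).

%% The five bytes that encode "a∞b" in UTF-8.
utf8_bytes() -> <<97,226,136,158,98>>.

%% Decoding the binary as UTF-8 yields three code points.
decode_bytes() ->
    unicode:characters_to_list(utf8_bytes(), utf8).

%% Integers in a list are treated as code points already,
%% so the same five values come back unchanged.
list_passthrough() ->
    unicode:characters_to_list(binary_to_list(utf8_bytes()), utf8).
```

Here utf8_demo:decode_bytes() gives [97,8734,98], while
utf8_demo:list_passthrough() gives [97,226,136,158,98].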

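The same problem applies to binaries, because the literal is again read
as Latin-1. With /utf8, each of the three bytes 226,136,158 is treated
as a character of its own and encoded as a two-byte sequence, which is
where <<97,195,162,194,136,194,158,98>> comes from.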
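
The double encoding can be reproduced directly, as a rough sketch that
spells out the integer values instead of relying on the literal. The
module and function names (utf8_double, double_encoded, encoded_once)
are hypothetical:

```erlang
-module(utf8_double).
-export([double_encoded/0, encoded_once/0]).

%% Each UTF-8 byte of "a∞b" is taken as a Latin-1 character
%% and then encoded to UTF-8 a second time.
double_encoded() ->
    unicode:characters_to_binary([97,226,136,158,98], latin1, utf8).

%% Starting from the real code points gives the expected five bytes.
encoded_once() ->
    unicode:characters_to_binary([97,8734,98], unicode, utf8).
```

utf8_double:double_encoded() gives <<97,195,162,194,136,194,158,98>>,
the value Joe saw, and utf8_double:encoded_once() gives
<<97,226,136,158,98>>.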

CGS




On Mon, Jul 30, 2012 at 2:35 PM, Joe Armstrong wrote:

> What is a literal string in Erlang? Originally it was a list of
> integers, each integer being a single character code. This made
> strings very easy to work with.
>
> The code
>
>     test() -> "a∞b".
>
> Compiles to code which returns the list
> of integers [97,226,136,158,98].
>
> This is very inconvenient. I had expected it to return
> [97, 8734, 98]. The length of the list should be 3 not 5
> since it contains three unicode characters not five.
>
> Is this a bug or a horrible misfeature?
>
> So how can I make a string with the three characters 'a' 'infinity' 'b'?
>
> test() -> "a\x{221e}b"        is ugly
>
> test() -> <<"a∞b"/utf8>>   seems to be a bug:
>     it gives an error in the shell but is OK in compiled code, where
>     it returns <<97,195,162,194,136,194,158,98>>, which is very strange
>
> test() -> [$a,8734,$b]       is ugly
>
> /Joe