[erlang-questions] unicode in string literals
Joe Armstrong
erlang@REDACTED
Mon Jul 30 16:25:54 CEST 2012
On Mon, Jul 30, 2012 at 3:06 PM, CGS <cgsmcmlxxv@REDACTED> wrote:
> Hi Joe,
>
> You may try unicode module:
>
> test() -> unicode:characters_to_list("a∞b",utf8).
>
> which will return the desired list [97,8734,98]. As Richard said, the
> default is Latin-1 (0-255 integers).
Very strange. I tried that earlier, and this is what happens:
$ Eshell V5.9 (abort with ^G)
1> unicode:characters_to_list([97,226,136,158,98], utf8).
[97,226,136,158,98]
The manual says the first argument is a utf8 string.
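
A possible explanation, as far as I can tell from the unicode module
documentation: the encoding argument only says how binaries in the input
are to be decoded. Integers in a plain list are taken as code points
already, so a list of UTF-8 bytes comes back unchanged. A minimal sketch
(the module name utf8_demo and the function names are made up for
illustration):

```erlang
-module(utf8_demo).
-export([bytes_as_list/0, bytes_as_binary/0, encode/0]).

%% The UTF-8 bytes of a, infinity, b held as a plain list of integers.
%% Each integer in a list is treated as a code point, so the list is
%% returned as it is: [97,226,136,158,98].
bytes_as_list() ->
    unicode:characters_to_list([97,226,136,158,98], utf8).

%% The same bytes wrapped in a binary. Now the utf8 argument tells the
%% function how to decode them, giving [97,8734,98].
bytes_as_binary() ->
    unicode:characters_to_list(list_to_binary([97,226,136,158,98]), utf8).

%% The other direction: code points to a UTF-8 encoded binary,
%% giving <<97,226,136,158,98>>.
encode() ->
    unicode:characters_to_binary([97,8734,98]).
```

So converting the byte list to a binary first seems to be the missing
step when the bytes really are UTF-8.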
/Joe
>
> As for binaries, the same problem (assuming Latin-1).
>
> CGS
>
>
>
>
> On Mon, Jul 30, 2012 at 2:35 PM, Joe Armstrong <erlang@REDACTED> wrote:
>>
>> What is a literal string in Erlang? Originally it was a list of
>> integers, each integer
>> being a single character code - this made strings very easy to work with
>>
>> The code
>>
>> test() -> "a∞b".
>>
>> Compiles to code which returns the list
>> of integers [97,226,136,158,98].
>>
>> This is very inconvenient. I had expected it to return
>> [97, 8734, 98]. The length of the list should be 3 not 5
>> since it contains three unicode characters not five.
>>
>> Is this a bug or a horrible misfeature?
>>
>> So how can I make a string with the three characters 'a' 'infinity' 'b'?
>>
>> test() -> "a\x{221e}b" is ugly
>>
>> test() -> <<"a∞b"/utf8>> seems to be a bug
>> it gives an error in the
>> shell but is ok in compiled code and
>> returns
>> <<97,195,162,194,136,194,158,98>> which is
>> very strange
>>
>> test() -> [$a,8734,$b] is ugly
>>
>> /Joe
>> _______________________________________________
>> erlang-questions mailing list
>> erlang-questions@REDACTED
>> http://erlang.org/mailman/listinfo/erlang-questions
>
>
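
The odd binary <<97,195,162,194,136,194,158,98>> quoted above looks like
double encoding: if the compiler reads the source file as Latin-1, the
three UTF-8 bytes of the infinity sign become three separate characters
(226, 136, 158), and /utf8 then encodes each of them again as two bytes.
A small sketch that reproduces the same bytes, assuming that reading of
what happened (the module name double_encode and its function names are
hypothetical):

```erlang
-module(double_encode).
-export([twice/0, once/0]).

%% Treat the UTF-8 bytes as Latin-1 characters and encode them to
%% UTF-8 a second time. Every byte above 127 turns into two bytes:
%% <<97,195,162,194,136,194,158,98>>.
twice() ->
    unicode:characters_to_binary([97,226,136,158,98], latin1, utf8).

%% Encode the real code points once: <<97,226,136,158,98>>.
once() ->
    unicode:characters_to_binary([97,8734,98], unicode, utf8).
```

Writing the code point as 8734 (or \x{221e}) avoids depending on how the
source file is read.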