[erlang-questions] Is this proper way to convert a latin string to utf8 string

Jesper Louis Andersen jesper.louis.andersen@REDACTED
Sun Jul 26 21:37:34 CEST 2015


On Sun, Jul 26, 2015 at 9:30 PM, 王昊 <jusfeel@REDACTED> wrote:

> I am using a web framework (Chicagoboss). I posted data containing Chinese
> text from a web form as a utf8 encoded string. Erlang reads it as
> [232,191,153], although this is just one single Chinese character. I want
> to consume it via ajax later on, on the client side.


Ah, you have three bytes, 232, 191, 153, in utf8 representation and want to
convert those into a unicode codepoint. Here I have a binary with those bytes
followed by a few ASCII characters, which are also valid utf8 since ASCII and
utf8 overlap:

9> B = <<232, 191, 153, $h, $e, $l, $l, $o>>.
<<232,191,153,104,101,108,108,111>>
10> [CP || <<CP/utf8>> <= B].
[36825,104,101,108,108,111]
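
Since the framework hands over a list of bytes rather than a binary, the same conversion can also be sketched with the standard unicode module, which reports malformed input instead of failing on a pattern match. This is only a minimal sketch: the module name utf8_demo and both function names are made up for illustration, and it assumes every element of the input list is a byte in the range 0..255.

```erlang
-module(utf8_demo).
-export([bytes_to_codepoints/1, codepoints_to_utf8/1]).

%% Hypothetical helper: Bytes is a list of UTF-8 encoded bytes,
%% such as [232,191,153]. Assumes every element is in 0..255.
bytes_to_codepoints(Bytes) when is_list(Bytes) ->
    Bin = list_to_binary(Bytes),
    case unicode:characters_to_list(Bin, utf8) of
        Codepoints when is_list(Codepoints) ->
            {ok, Codepoints};
        {error, _Decoded, _Rest} ->
            %% An invalid UTF-8 sequence was found.
            {error, invalid_utf8};
        {incomplete, _Decoded, _Rest} ->
            %% The input ended in the middle of a multi-byte character.
            {error, incomplete_utf8}
    end.

%% Hypothetical helper: encode a list of codepoints back into a
%% UTF-8 binary, for example before sending it to the client.
codepoints_to_utf8(Codepoints) when is_list(Codepoints) ->
    unicode:characters_to_binary(Codepoints, unicode, utf8).
```

With this, utf8_demo:bytes_to_codepoints([232,191,153]) gives {ok,[36825]}, and utf8_demo:codepoints_to_utf8([36825]) gives back <<232,191,153>>.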



-- 
J.