[erlang-questions] How to convert UTF8 string to lowercase?

Fred Hebert mononcqc@REDACTED
Thu Feb 9 18:08:12 CET 2012


As far as I know, the string module is all about lists. Erlang handles
Unicode in two different ways: 1. lists of Unicode code points (no specific
encoding), and 2. binaries with a specific encoding (utf-8, utf-16, utf-32).
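
To make the two representations concrete, here is a small sketch. The
module and function names (unicode_forms, demo/0) are made up for
illustration; only unicode:characters_to_binary/1,3 comes from OTP.

```erlang
-module(unicode_forms).
-export([demo/0]).

%% Hypothetical demo: the same one-character string in both forms.
demo() ->
    %% Form 1: a list of code points. "é" is U+00E9, i.e. 233.
    CodePoints = [233],
    %% Form 2: a binary in a specific encoding.
    %% In UTF-8 the character takes two bytes.
    Utf8 = unicode:characters_to_binary(CodePoints),
    <<195, 169>> = Utf8,
    %% In UTF-16 (big endian by default) it is one 16-bit unit.
    Utf16 = unicode:characters_to_binary(CodePoints, unicode, utf16),
    <<0, 233>> = Utf16,
    {CodePoints, Utf8, Utf16}.
```

The list carries no encoding at all; the encoding only shows up once the
data is turned into a binary.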

A list holding utf-8 data will be seen as a list of bytes, not a proper
Unicode string. It has no notion of code points or encoding and is
equivalent to latin-1. You need to convert it back to a list of code points
with unicode:characters_to_list(list_to_binary(ByteList)), or to an encoded
binary with unicode:characters_to_binary(IoList).
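
A minimal sketch of that conversion, assuming ByteList really contains
valid utf-8 bytes. The module name utf8_bytes and its two functions are
hypothetical wrappers; the calls inside them are the ones named above.

```erlang
-module(utf8_bytes).
-export([to_codepoints/1, to_utf8/1]).

%% ByteList holds utf-8 bytes as integers, e.g. [195,169] for "é".
%% Packing the bytes into a binary lets the unicode module decode them
%% as utf-8 instead of treating each byte as its own character.
%% On invalid or truncated input, unicode:characters_to_list/1 returns
%% an {error, ...} or {incomplete, ...} tuple instead of a list.
to_codepoints(ByteList) ->
    unicode:characters_to_list(list_to_binary(ByteList)).

%% Going the other way: a list of code points to a utf-8 encoded binary.
to_utf8(CodePoints) ->
    unicode:characters_to_binary(CodePoints).
```

For example, utf8_bytes:to_codepoints([195,169]) gives [233], and
utf8_bytes:to_utf8([233]) gives <<195,169>>.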

In any case, as far as I know, this is all the Unicode support Erlang
offers. There is no library to handle and transform it, only to convert and
transport it.

On Feb 9, 2012 11:35 AM, "Martin Dimitrov" <mrtndimitrov@REDACTED> wrote:

> Hello,
>
> The string module documentation makes no mention of Unicode. Can
> to_lower/1 be used to convert a UTF-8 encoded string to lowercase?
>
> I remember I read somewhere that these functions are not Unicode safe.
> So what else can be used?
>
> Regards,
>
> Martin
> _______________________________________________
> erlang-questions mailing list
> erlang-questions@REDACTED
> http://erlang.org/mailman/listinfo/erlang-questions
>