[erlang-questions] unicode in string literals

Mon Jul 30 15:52:00 CEST 2012

Valid point. I didn't say the solution should necessarily be used; I just
gave one that answers the problem raised. As for how to use it, I don't
think Joe needs any further instruction (especially from me).
:)

CGS

On Mon, Jul 30, 2012 at 3:23 PM, Richard Carlsson <
carlsson.richard@REDACTED> wrote:

> On 07/30/2012 03:06 PM, CGS wrote:
>
>> Hi Joe,
>>
>> You may try unicode module:
>>
>> test() -> unicode:characters_to_list("a∞b",utf8).
>>
>> which will return the desired list [97,8734,98]. As Richard said, the
>> default is Latin-1 (0-255 integers).
>>
>
> No! Don't save a source file as UTF8, at least without a way of marking up
> such files as being special. The problem is that if you do the trick above,
> you have to ensure that you convert _all_ string literals explicitly this
> way (at least if they may contain characters outside ASCII). But if you
> have a character such as ö, or é, in a string and you forget to convert
> explicitly from UTF8 to single code points, then that "é" will in fact be 2
> bytes, while in another module saved in Latin-1, the string "é" that looks
> the same in your editor will be a single byte, and they won't compare
> equal. Having modules saved with different encodings is a recipe for
> disaster (in particular when it comes to future maintenance). Erlang
> currently only supports Latin-1 in source files; until that is fixed, you
> should keep your UTF8-data in separate files.
>
>    /Richard
>
>
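To make the mismatch Richard describes concrete, here is a minimal sketch.
The module name enc_demo and the function demo/0 are made up for
illustration. The byte values are written out as integers so the result does
not depend on how the example file itself is saved, and the UTF-8 bytes are
turned into a binary before decoding, which is an assumption about how one
would do the explicit conversion rather than something stated in the thread.

```erlang
-module(enc_demo).
-export([demo/0]).

%% Hypothetical demo: the same visible character (e-acute) as it would
%% arrive from two source files saved with different encodings.
demo() ->
    %% e-acute from a file saved as UTF-8 but read as Latin-1: two bytes.
    Utf8Bytes = [195, 169],
    %% e-acute from a file saved as Latin-1: one byte, equal to its code point.
    Latin1Bytes = [233],

    %% The two strings look the same in an editor but do not compare equal.
    false = (Utf8Bytes =:= Latin1Bytes),

    %% Explicitly decoding the UTF-8 bytes gives the single code point.
    Decoded = unicode:characters_to_list(list_to_binary(Utf8Bytes), utf8),
    true = (Decoded =:= Latin1Bytes),
    ok.
```

Calling enc_demo:demo() returns ok only if both matches succeed. Forgetting
the explicit decode on even one literal leaves the two-byte form in place,
which is the maintenance hazard described above.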