[erlang-questions] unicode in string literals

Tue Jul 31 09:05:54 CEST 2012

Is "encoding(...)" a good idea?

There are four reasonable alternatives:

    a) - all files are Latin1
    b) - all files are UTF8
    c) - all files are Latin1 or UTF8 and you guess
    d) - all files are Latin1 or UTF8 or anything else and you tell
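To make alternative c) concrete, here is a minimal sketch in Python of what
"guessing" could look like. The function name guess_encoding is hypothetical
and the strategy (try ASCII, then strict UTF8, else fall back to Latin1) is
only one assumption about how a guess might work, not anything Erlang does.

```python
def guess_encoding(data: bytes) -> str:
    """Guess 'ascii', 'utf-8' or 'latin-1' for the raw bytes of a source file.

    Hypothetical helper: pure ASCII is unambiguous, anything that decodes
    as strict UTF-8 is assumed to be UTF-8, everything else is treated as
    Latin-1 (every byte sequence is valid Latin-1).
    """
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "latin-1"
```

The weakness of guessing shows up with Latin1 text whose bytes happen to form
valid UTF8: the two Latin1 characters 0xC3 0xA9 are reported as "utf-8" and
would be read as a single character instead.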

Today we do a).

What would be the consequences of changing to b) in (say) the next
major release?

This would break some code - but how much? How much code is there
with non-Latin1 printable characters in string literals? It should be
easy to write a program to test for this and flag string literals that
might cause problems if the default convention were changed.
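A rough sketch of such a checking program, in Python. It assumes that the
literals worth flagging are the ones containing bytes outside ASCII (0x80 and
up), since those are the bytes that would be read differently under UTF8. The
function name flag_string_literals is hypothetical, and the scanner is a
simplification that only knows about comments, character literals, quoted
atoms and double-quoted strings, not a full Erlang tokenizer.

```python
def flag_string_literals(source: bytes):
    """Return [(line_number, literal_bytes)] for every double-quoted string
    literal that contains a byte >= 0x80.

    Simplified scanner (an assumption, not the real Erlang tokenizer):
    skips % comments, $c character literals and 'quoted atoms'.
    """
    hits = []
    i, line, n = 0, 1, len(source)
    while i < n:
        c = source[i:i + 1]
        if c == b"\n":
            line += 1
            i += 1
        elif c == b"%":
            # Comment: skip to the end of the line.
            while i < n and source[i:i + 1] != b"\n":
                i += 1
        elif c == b"$":
            # Character literal such as $" or $\" : skip it so the quote
            # is not mistaken for the start of a string.
            i += 1
            if source[i:i + 1] == b"\\":
                i += 1
            if source[i:i + 1] == b"\n":
                line += 1
            i += 1
        elif c in (b'"', b"'"):
            start, start_line = i, line
            i += 1
            while i < n and source[i:i + 1] != c:
                if source[i:i + 1] == b"\\":
                    i += 1  # step over the backslash of an escape
                if source[i:i + 1] == b"\n":
                    line += 1
                i += 1
            i += 1  # step over the closing quote
            literal = source[start:i]
            # Only strings are reported; quoted atoms are merely skipped.
            if c == b'"' and any(b >= 0x80 for b in literal):
                hits.append((start_line, literal))
        else:
            i += 1
    return hits
```

Running this over the raw bytes of every .erl file in a code base (for example
with open(path, "rb").read()) would give a first estimate of how many literals
a change of default would affect.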

/Joe

On Tue, Jul 31, 2012 at 12:44 AM, Richard O'Keefe <ok@REDACTED> wrote:
> The thing that puzzles me about Erlang assuming that source files are in
> Latin 1 is that I have a tokenizer for Erlang that assumes Latin 1 and
> in every Erlang/OTP release I've checked there has been at least one
> file it tripped up on because of UTF-8 characters.
>
> When can we expect -encoding('whatever'). to be supported?