[erlang-questions] unicode in string literals

Tue Jul 31 21:03:08 CEST 2012

Source code in Latin-1 that used non-ASCII (>= 128) bytes would almost
always be invalid UTF-8.
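
A minimal sketch of that point in Python, chosen only for brevity. The
helper name is_valid_utf8 and the sample line are made up for
illustration and do not come from OTP. It encodes the same source line
both ways and tries a strict UTF-8 decode on each.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


line = 'Name = "Bj\u00f6rn".'            # \u00f6 is o with diaeresis
latin1_source = line.encode("latin-1")   # that character becomes the byte 0xF6
utf8_source = line.encode("utf-8")       # that character becomes 0xC3 0xB6

print(is_valid_utf8(latin1_source))  # False
print(is_valid_utf8(utf8_source))    # True

# The rare exception: a Latin-1 byte pair that happens to form a
# well-formed UTF-8 sequence (0xC3 0xB6 here) passes the check.
print(is_valid_utf8(b"\xc3\xb6"))    # True
```

The same check is roughly what option c) in the quoted message would
rely on: treat the file as UTF-8 if it decodes cleanly, otherwise fall
back to Latin-1.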

On Tue, Jul 31, 2012 at 3:05 AM, Joe Armstrong <erlang@REDACTED> wrote:
> Is "encoding(...)" a good idea?
>
> There are four reasonable alternatives
>
>     a) - all files are Latin1
>     b) - all files are UTF8
>     c) - all files are Latin1 or UTF8 and you guess
>     d) - all files are Latin1 or UTF8 or anything else and you tell
>
> Today we do a).
>
> What would be the consequences of changing to b) in (say) the next
> major release?
>
> This would break some code - but how much? - how much code is there
> with non Latin1 printable characters in string literals? - it should
> be easy to write a program to test for this and flag string literals
> that might cause problems if the default convention was changed.
>
> /Joe
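
A rough sketch of such a checker, again in Python. It is not a real
Erlang tokenizer: it is a simplified scanner that assumes only
% comments, $ character literals, quoted atoms and double-quoted
strings matter, and it reports every double-quoted literal containing a
byte >= 0x80. The function name flag_string_literals and the sample
source are hypothetical.

```python
NEWLINE, PERCENT, DOLLAR = 0x0A, ord("%"), ord("$")
BACKSLASH, DQUOTE, SQUOTE = ord("\\"), ord('"'), ord("'")


def flag_string_literals(source: bytes):
    """Return [(line_number, literal_bytes)] for each double-quoted
    literal that contains at least one non-ASCII byte (simplified scan)."""
    results = []
    i, line, n = 0, 1, len(source)
    while i < n:
        c = source[i]
        if c == NEWLINE:
            line += 1
            i += 1
        elif c == PERCENT:
            # Comment: skip to end of line (the newline is handled above).
            while i < n and source[i] != NEWLINE:
                i += 1
        elif c == DOLLAR:
            # Character literal such as $a, $" or $\n: skip its character.
            i += 1
            if i < n and source[i] == BACKSLASH:
                i += 1
            if i < n and source[i] == NEWLINE:
                line += 1
            i += 1
        elif c in (DQUOTE, SQUOTE):
            quote, start, start_line = c, i, line
            i += 1
            while i < n and source[i] != quote:
                if source[i] == BACKSLASH:
                    i += 1  # move onto the escaped character
                if i < n and source[i] == NEWLINE:
                    line += 1
                i += 1
            i += 1  # step past the closing quote
            literal = source[start:i]
            # Quoted atoms are scanned so their contents are skipped,
            # but only double-quoted strings are reported.
            if quote == DQUOTE and any(b >= 0x80 for b in literal):
                results.append((start_line, literal))
        else:
            i += 1
    return results


sample = (
    b'%% "caf\xe9" inside a comment is ignored\n'
    b'greet() -> "hello".\n'
    b'name() -> "Bj\xf6rn".\n'
    b'quote() -> $".\n'
    b'city() -> "Malm\xf6".\n'
)

for line_number, literal in flag_string_literals(sample):
    print(line_number, literal)
# 3 b'"Bj\xf6rn"'
# 5 b'"Malm\xf6"'
```

Run over a source tree, the count of flagged literals would give a
first estimate of how much code a switch to b) might touch.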
>
>
>
> On Tue, Jul 31, 2012 at 12:44 AM, Richard O'Keefe <ok@REDACTED> wrote:
>> The thing that puzzles me about Erlang assuming that source files are in
>> Latin 1 is that I have a tokenizer for Erlang that assumes Latin 1 and
>> in every Erlang/OTP release I've checked there has been at least one
>> file it tripped up on because of UTF-8 characters.
>>
>> When can we expect -encoding('whatever'). to be supported?
>>
>> _______________________________________________
>> erlang-questions mailing list
>> erlang-questions@REDACTED
>> http://erlang.org/mailman/listinfo/erlang-questions