[erlang-questions] unicode in string literals

Tue Jul 31 09:09:46 CEST 2012

On 07/31/2012 12:00 AM, Michel Rijnders wrote:
> On Tue, Jul 31, 2012 at 1:41 AM, Michael Truog <mjtruog@REDACTED> wrote:
>> On 07/30/2012 03:44 PM, Richard O'Keefe wrote:
>>> The thing that puzzles me about Erlang assuming that source files are in
>>> Latin 1 is that I have a tokenizer for Erlang that assumes Latin 1 and
>>> in every Erlang/OTP release I've checked there has been at least one
>>> file it tripped up on because of UTF-8 characters.
>>>
>>> When can we expect -encoding('whatever'). to be supported?
>> As things currently stand, the solution is simply to use modelines (within the first 3 lines of the file), which your favorite editor, vi or emacs, supports:
>> % -*- coding: utf-8; Mode: erlang; tab-width: 4; c-basic-offset: 4; indent-tabs-mode: nil -*-
>> % ex: set softtabstop=4 tabstop=4 shiftwidth=4 expandtab fileencoding=utf-8:
>>
> Shouldn't that modeline read:
> % -*- coding: latin-1; mode: erlang; tab-width: 4; c-basic-offset: 4;
> indent-tabs-mode: nil -*-
>
> since the compiler assumes source files are in Latin 1?

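I think the point was to use UTF-8 in the source file, hence the utf-8 in the modeline.  An -encoding() directive would be necessary for Erlang names (functions, variables, etc.) to be in UTF-8, but the modeline can help keep list data as UTF-8: the editor saves string literals as UTF-8 bytes, and a compiler that assumes Latin 1 passes each byte through unchanged as one list element.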
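
To make the "list data as UTF-8" point concrete, here is a minimal sketch of what a string literal looks like once a Latin 1 compiler has read a UTF-8 source file, and how the bytes could be turned back into code points at runtime. The module and function names (utf8_literal_demo, decode/1, demo/0) are made up for illustration; the only library call used is unicode:characters_to_list/2 from stdlib. The bytes are spelled out numerically so that the example itself is pure ASCII.

```erlang
-module(utf8_literal_demo).
-export([decode/1, demo/0]).

%% Hypothetical helper: reinterpret a list of bytes (what a Latin 1
%% compiler produces for a UTF-8 encoded string literal) as UTF-8 and
%% return the Unicode code points.
decode(Bytes) when is_list(Bytes) ->
    %% The InEncoding argument only applies to binaries, so the byte
    %% list is converted to a binary first.
    unicode:characters_to_list(list_to_binary(Bytes), utf8).

demo() ->
    %% U+00E9 (e with acute accent) saved as UTF-8 is the two bytes
    %% C3 A9, so the literal arrives as a two-element list.
    Literal = [16#C3, 16#A9],
    2 = length(Literal),
    %% Decoding the bytes yields the single code point U+00E9.
    [16#E9] = decode(Literal),
    ok.
```

So the literal survives compilation intact as bytes, but length/1 and friends count bytes rather than characters until the data is decoded.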