[erlang-questions] UTF8 and EDoc

Cameron Kerr ckerr@REDACTED
Tue Oct 6 00:31:01 CEST 2009


Tomas Abrahamsson wrote:
> Richard Carlsson wrote:
>> Yes, this is a known problem. The short answer is that the input
>> encoding for Erlang source code is defined to be Latin-1. [...]
>> What would be needed is something like a \u-escaping preprocessing
>> stage, as specified for Java. But then, the tools must also know
>> about \u escape sequences and turn them back into the proper code
>> point in UTF-8 or whatever.
>>     
>
> An option could be to adopt the way it is done in Python:
> it (re)uses the editor's encoding declaration. If it finds the text
>    -*- coding: utf-8 -*-  or  vim: set fileencoding=utf-8 :
>   
There is already a way to indicate that a file is UTF-8 (or UTF-16BE or 
UTF-16LE, for that matter): the byte-order mark. Although the BOM 
carries no byte-ordering information for UTF-8, it also acts as a 
signature saying "hey, I'm UTF-8!", a message which numerous programs 
understand.
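
As a rough sketch of what such a check could look like (in Python, since 
it came up above): the function name sniff_bom and the choice to strip 
the BOM are purely illustrative assumptions, not anything the Erlang 
tools do today. The Latin-1 fallback just mirrors the currently defined 
source encoding.

```python
import codecs

# BOM signatures for the three encodings mentioned above.
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),          # EF BB BF
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
]


def sniff_bom(data: bytes, default: str = "latin-1"):
    """Return (encoding, payload) based on a leading BOM.

    Hypothetical helper: if no BOM is present, fall back to `default`
    and return the bytes unchanged.
    """
    for bom, name in BOMS:
        if data.startswith(bom):
            # Strip the BOM so it does not show up as U+FEFF in the text.
            return name, data[len(bom):]
    return default, data


if __name__ == "__main__":
    encoding, payload = sniff_bom(codecs.BOM_UTF8 + "%% caf\u00e9".encode("utf-8"))
    print(encoding, payload.decode(encoding))
```

A file without a BOM is simply treated as before, so existing Latin-1 
sources would not need to change.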


