[erlang-questions] UTF8 and EDoc

Mon Oct 5 21:40:45 CEST 2009

How can I convert Erlang files to C source code?
Thanks
Sumit

2009/10/6 Tomas Abrahamsson <tomas.abrahamsson@REDACTED>

> > Ngoc Dao wrote:
> >> When I use the EDoc library in Erlang R13B02-1 to create documentation
> >> with Japanese characters in the doc comments, there is an error:
>
> Richard Carlsson wrote:
> > Yes, this is a known problem. The short answer is that the input
> > encoding for Erlang source code is defined to be Latin-1. [...]
> > What would be needed is something like a \u-escaping preprocessing
> > stage, as specified for Java. But then, the tools must also know
> > about \u escape sequences and turn them back into the proper code
> > point in UTF-8 or whatever.
>
> An option could be to adopt the way it is done in Python:
> it (re)uses the editor's encoding declaration. If it finds the text
>   -*- coding: utf-8 -*-  or  vim: set fileencoding=utf-8 :
> on the first or second line of the source file, then it sets
> the encoding for the entire source file accordingly. (It also
> understands Unicode byte-order marks at the beginning
> of the file, which apparently makes life easier in editors
> on Windows.)
>
> See http://www.python.org/peps/pep-0263.html for details.
>
> An advantage with this scheme seems to be that it fits nicely
> with editors. They already know how to handle this.
>
> It would probably require the Erlang compiler, edoc, and other tools
> to be modified to know about source file encodings, though.
>
> I suppose that with the \u-escaping, existing tools would continue
> to work without modification, but it would be more work for the
> programmer to type the text in as \u-sequences, unless editors
> already know how to do such a transformation on the fly?
>
> If no such encoding declaration is found, Python assumes ASCII,
> but Erlang could maybe assume Latin-1. If Python finds non-ASCII
> characters in a file with no encoding declaration, then it spits
> out an error like this (wrapped for readability):
>
>  prompt# python /tmp/x.py
>    File "/tmp/x.py", line 3
>  SyntaxError: Non-ASCII character '\xe5' in file /tmp/x.py on line 3,
>  but no encoding declared; see http://www.python.org/peps/pep-0263.html
>  for details
>  prompt# cat /tmp/
>  #! /usr/bin/env python
>
>  print 'åäö'
>
> BRs
> Tomas
>
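
The detection scheme described in the quoted message (a byte-order mark, or a coding comment on the first or second line, with a Latin-1 fallback) could be sketched roughly as follows. This is only an illustration, not how the Erlang compiler or EDoc behaves: the function name detect_source_encoding is hypothetical, and the pattern is modelled on the one given in PEP 263 with Erlang's % comment marker substituted for Python's #, which is an assumption.

```python
import codecs
import re

# Modelled on the PEP 263 pattern, but with '%' (Erlang comment marker)
# instead of '#'. This adaptation is an assumption, not an existing tool.
CODING_RE = re.compile(rb"^[ \t\f]*%.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def detect_source_encoding(data: bytes, default: str = "latin-1") -> str:
    """Guess the encoding of source bytes (hypothetical helper).

    Order of checks: UTF-8 byte-order mark, then a coding comment on the
    first or second line, then the default (Latin-1, as suggested above).
    """
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8"
    # Only the first two lines are inspected, as in PEP 263.
    for line in data.splitlines()[:2]:
        match = CODING_RE.match(line)
        if match:
            name = match.group(1).decode("ascii")
            codecs.lookup(name)  # raises LookupError for an unknown name
            return name
    return default
```

A tool would then decode the file with data.decode(detect_source_encoding(data)). Because "fileencoding=" ends in "coding=", the same pattern covers both the Emacs and the Vim style of declaration.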