[erlang-questions] UTF8 and EDoc
Ngoc Dao
ngocdaothanh@REDACTED
Mon Oct 5 11:31:49 CEST 2009
Here is my fix to make EDoc work with Japanese text (tested on R13B02-1,
with the .erl and overview.edoc files saved as UTF-8). I think it should
also work for other languages:
1. At edoc_lib:write_file/4
Change
    file:open(File, [write])
to
    file:open(File, [write, {encoding, utf8}])
This is better than my previous dirty hack.
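To illustrate what the {encoding, utf8} option changes, here is a minimal standalone sketch. It is not the EDoc code itself; the module name utf8_write_demo and the function write/2 are made up for this example. It assumes the text is a list of Unicode code points and that the file is written with io:put_chars/2, as write_file does.

```erlang
-module(utf8_write_demo).
-export([write/2]).

%% Hypothetical helper, not part of EDoc: write Chars (a list of
%% Unicode code points) to File, encoded as UTF-8.
write(File, Chars) ->
    case file:open(File, [write, {encoding, utf8}]) of
        {ok, FD} ->
            %% With {encoding, utf8} the io server encodes each code
            %% point as UTF-8 bytes instead of expecting Latin-1.
            ok = io:put_chars(FD, Chars),
            file:close(FD);
        {error, _Reason} = Error ->
            Error
    end.
```

For example, utf8_write_demo:write("out.txt", [26085, 26412]) should leave the six UTF-8 bytes of the two characters U+65E5 and U+672C in out.txt.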
2. At edoc_tags:parse_tags/5
Change
    case dict:fetch(Name, How) of
        text ->
            parse_tags(Ts, How, Env, Where, [T | Ts1]);
to
    case dict:fetch(Name, How) of
        text ->
            Data = unicode:characters_to_list(list_to_binary(T#tag.data)),
            T2 = T#tag{data = Data},
            parse_tags(Ts, How, Env, Where, [T2 | Ts1]);
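The conversion in step 2 can be tried on its own. The sketch below is only an illustration: the module utf8_tag_demo and the function decode/1 are hypothetical names, and the fallback for invalid input is my own assumption, not something the patch above does. It assumes the tag data is a list with one integer (0..255) per byte of the UTF-8 file, which is what you get when the file is read as Latin-1.

```erlang
-module(utf8_tag_demo).
-export([decode/1]).

%% Hypothetical helper, not part of EDoc. Bytes is a list of integers
%% in 0..255, one element per byte of UTF-8 encoded text.
decode(Bytes) ->
    case unicode:characters_to_list(list_to_binary(Bytes)) of
        Chars when is_list(Chars) ->
            %% Success: one integer per Unicode code point.
            Chars;
        _ErrorOrIncomplete ->
            %% {error, _, _} or {incomplete, _, _}: the bytes are not
            %% valid UTF-8, so keep the input unchanged (assumption).
            Bytes
    end.
```

For example, utf8_tag_demo:decode([230,151,165,230,156,172]) turns six bytes into the two code points [26085, 26412]. Note that list_to_binary/1 fails with badarg if the list already contains integers above 255, so this only applies to data that is still a byte list.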
Regards,
Ngoc
On Wed, Sep 30, 2009 at 6:37 PM, Richard Carlsson
<carlsson.richard@REDACTED> wrote:
> Ngoc Dao wrote:
>> When I use the EDoc library in Erlang R13B02-1 to generate documentation
>> with Japanese characters in the doc comments, I get an error:
>
> Yes, this is a known problem. The short answer is that the input
> encoding for Erlang source code is defined to be Latin-1. That is,
> if you put things like Japanese or Russian characters in the
> source files, you are breaking the rules to begin with. (If it's only
> in comments, and using UTF-8, it will not prevent the compiler from
> skipping the comments and compiling the program, but you can't
> expect anything else to work.)
>
> What would be needed is something like a \u-escaping preprocessing
> stage, as specified for Java. But then, the tools must also know
> about \u escape sequences and turn them back into the proper code
> point in UTF-8 or whatever.
>
> /Richard
>