[erlang-questions] file:read_file an UTF-8 encoded file
Camille Troillard
lists@REDACTED
Thu Jun 26 21:46:19 CEST 2014
Here is what I came up with:
%% Read a file as a binary, knowing the contents is UTF-8 encoded text.
read_utf8_file(Name) ->
    {ok, Binary} = file:read_file(Name),
    %% Skip is the BOM length in bytes (0 if no BOM is present).
    {_, Skip} = unicode:bom_to_encoding(Binary),
    <<_:Skip/unit:8, Contents/binary>> = Binary,
    Contents.
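
One caveat with the function above: unicode:bom_to_encoding/1 also recognizes UTF-16 and UTF-32 byte order marks, so a file in one of those encodings would have its BOM stripped and still be returned as if it were UTF-8. A slightly more defensive variant, as a rough sketch only, could match the three UTF-8 BOM bytes directly and validate the rest with unicode:characters_to_binary/3. The module name utf8_file, the function names, and the error atoms are made up for illustration.

```erlang
-module(utf8_file).
-export([read/1, strip_bom/1]).

%% Read a file that is expected to contain UTF-8 text.
%% Returns {ok, Binary} without the BOM, or {error, Reason}.
read(Name) ->
    case file:read_file(Name) of
        {ok, Binary} -> validate(strip_bom(Binary));
        {error, _} = Error -> Error
    end.

%% Drop the 3-byte UTF-8 BOM (EF BB BF) if present; leave anything else alone.
strip_bom(<<16#EF, 16#BB, 16#BF, Rest/binary>>) -> Rest;
strip_bom(Binary) -> Binary.

%% Check that the bytes are well-formed UTF-8 (error atoms are hypothetical).
validate(Binary) ->
    case unicode:characters_to_binary(Binary, utf8, utf8) of
        Valid when is_binary(Valid) -> {ok, Valid};
        {error, _, _} -> {error, invalid_utf8};
        {incomplete, _, _} -> {error, incomplete_utf8}
    end.
```

Calling utf8_file:read("my_utf8_file.txt") would then return {ok, Contents} for a valid file, with or without a BOM, instead of crashing on a badmatch when the file cannot be read.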
Thanks to all for your advice.
On 26 Jun 2014, at 18:23, Loïc Hoguin <essen@REDACTED> wrote:
> file:read_file/1 reads a file as a sequence of bytes. It doesn't know what kind of file it is, or how to interpret it. It's not file:read_text_file/1 or similar, it's just read_file. It's not high level at all, it's just a convenient shortcut.
>
> On 06/26/2014 05:58 PM, Camille Troillard wrote:
>> Hi list,
>>
>> This is a simple question, yet I haven’t found the right answer.
>> Using Erlang/OTP 16B03-2:
>>
>> I read a file using file:read_file("my_utf8_file.txt").
>>
>> The result binary contains the 3 BOM bytes. I was not expecting that. Since this is such a high-level call, isn’t file:read_file/1 supposed to get rid of the byte order mark?
>>
>> So, how do you professional Erlang users read the contents of a UTF-8 encoded file on Erlang 16B03?
>>
>>
>> All the best,
>> Cam
>>
>> _______________________________________________
>> erlang-questions mailing list
>> erlang-questions@REDACTED
>> http://erlang.org/mailman/listinfo/erlang-questions
>>
>
> --
> Loïc Hoguin
> http://ninenines.eu