[erlang-questions] Obsolete exported functions file:raw_{read, write}_file_info/2 - why?

Joseph Norton norton@REDACTED
Thu Sep 15 12:25:57 CEST 2011


Zabrane -

Based on our benchmarking results, I believe prim_file:read_file/1 is the better alternative for performance. The prim_file approach has a second advantage: it should prevent callers doing file I/O to different disks and/or different disk controllers from impacting each other. We have seen a case in production where the failure of one disk crashed Erlang processes doing file I/O to an unrelated disk. The suspected root cause (not yet reproduced and not yet proven) is the singleton file I/O server process.
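
As a rough illustration of the two call paths (a sketch only, not the attached module), the following reads the same file once through the file server and once directly through prim_file. The module name read_file_compare and the function compare/1 are made up for this example, and prim_file is an undocumented internal module whose API may differ between OTP releases.

```erlang
-module(read_file_compare).
-export([compare/1]).

%% Hypothetical helper: read Path through both APIs and report the
%% wall-clock time of each call in microseconds.
%% NOTE: prim_file is an internal, undocumented module; treat its use
%% here as an assumption that may not hold on every OTP release.
compare(Path) ->
    %% file:read_file/1 is routed through the file server process.
    {FileUs, {ok, Bin1}} = timer:tc(file, read_file, [Path]),
    %% prim_file:read_file/1 is executed in the calling process.
    {PrimUs, {ok, Bin2}} = timer:tc(prim_file, read_file, [Path]),
    %% Both paths should return identical contents.
    true = (Bin1 =:= Bin2),
    [{file_us, FileUs}, {prim_file_us, PrimUs}].
```

A single call like this is not a benchmark: the second read usually benefits from the OS page cache, so any real comparison should repeat both calls many times and from many concurrent processes.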

Nevertheless, I was hoping to learn from my original post why the file module hasn't been written (or re-written) to provide raw support for other operations beyond just open.  My only guess is that file i/o hasn't been a bottleneck for most Erlang applications.

thanks,

Joseph Norton
norton@REDACTED



On Sep 15, 2011, at 6:36 PM, Zabrane Mickael wrote:

> Thanks for sharing, Joseph.
> 
> One more question:
> Is the call prim_file:read_file/1 a better alternative to file:read_file/1?
> 
> Regards,
> Zabrane
> 
> On Sep 15, 2011, at 11:25 AM, Joseph Norton wrote:
> 
>> 
>> Hi.
>> 
>> I'm working on a patch for the file.erl module itself. In the meantime, see the attached module as a working example. We have been using this approach for benchmarking purposes since early summer. Its performance is dramatically better than that of the default file implementation.
>> 
>> thanks,
>> 
>> Joseph Norton
>> norton@REDACTED
>> 
>> <basho_bench_erlang_file_alternative.erl>
>> On Sep 15, 2011, at 1:57 AM, erlang wrote:
>> 
>>>> I recommend reading the file.erl source.  It's quite instructive to see
>>>> how many file I/O functions are redirected to the 'file_server_2'
>>>> process.  For file I/O-intensive applications (e.g. Hibari and Riak I
>>>> know, CouchDB and RabbitMQ I'd guess), having all calls to(*)
>>>> file:read_file_info/1 serialized by the file server process is a source
>>>> of latency that we (DB authors) may desire to live without.
>>> 
>>> How did you proceed to avoid these calls and reduce latency, Scott?
>>> Any hints?
>> 
> 
> 



