[erlang-patches] Added native OS file system copy to efile

Blaine Whittle BWhittle@REDACTED
Wed Oct 6 18:24:06 CEST 2010


Understood...  

This change alone led to a 30% performance increase for our app (yes, we do a fair bit of file copying).  The reason for implementing the OS copy in efile (or a standalone driver) instead of a NIF is so that the copy can run on the async threads enabled with the +A command line option, which don't block the scheduler.  I implemented this in efile because the prim_file.erl comment suggested that there was a preference for a driver implementation of copy.

What are your thoughts about adding a check to see if async threads are enabled and returning enotsup otherwise?

On a semi-unrelated note, I'm about to implement a function to quickly verify whether a file exists.  At the moment, I'm just using prim_file:read_file_info/1 for this.  The problem is that read_file_info is a little too heavy for this simple check, as it collects extra info about the file.  This means that the existence check takes 2 or 3 times longer when the file exists than when it doesn't.  For most cases, I'm sure this is fine, but when you are doing a few million checks, it adds up.

If there is any interest in having this as part of the efile API, I can add this function to efile along with the posix and win32 implementations.  (FYI, there is already an efile_may_openfile function which is close to what I want; the problem is that it's not exposed to prim_file.)  Otherwise I'll implement a lightweight file-exists check as a NIF.
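
For illustration only, here is a rough sketch of what the posix side of such a check might look like. The name file_exists is hypothetical and is not part of efile; it assumes access(2) is an acceptable way to ask the question, and it treats any failure (including a permission error on a parent directory) as "does not exist".

```c
#include <unistd.h>

/* Hypothetical helper, not an existing efile function.
 * Returns 1 if the path can be resolved, 0 otherwise.
 * access(path, F_OK) only tests for existence, so it does not
 * fill in the full stat structure the way a file-info call would. */
static int file_exists(const char *path)
{
    return access(path, F_OK) == 0;
}
```

A win32 counterpart would need a different system call; that part is left out here.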
 


-----Original Message-----
From: Björn Gustavsson [mailto:bgustavsson@REDACTED] 
Sent: Wednesday, October 06, 2010 6:45 AM
To: Blaine Whittle
Cc: erlang-patches@REDACTED
Subject: Re: [erlang-patches] Added native OS file system copy to efile

On Tue, Oct 5, 2010 at 9:33 PM, Blaine Whittle <BWhittle@REDACTED> wrote:
> prim_file:copy has a comment that reads " xxx Should be moved down to the driver for optimization".  This patch does just that, however to ensure compatibility I haven't changed the existing implementation of prim_file:copy/3;  I've added two new functions prim_file:raw_copy/2, and prim_file:raw_copy/3 which perform the copy in the driver.
>
> git fetch git://github.com/bwhittle/otp.git prim_file_copy
>

As currently written, copying a huge file will block
the thread for the entire duration of the copy operation.
That is not OK.

If the copy function is to be moved to the driver,
there must be a limit to the number of bytes that
the driver is allowed to copy at one time. When
that number has been exceeded, the driver should
return control to the Erlang process to allow other
Erlang processes to be run. That was the original
idea behind the comment, but it is not really
clear that it would improve performance much.
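
A minimal sketch of the bounded-copy idea described above, assuming
plain posix read/write on two already-open file descriptors. The
name copy_bounded and the 64 KB buffer size are hypothetical and are
not taken from the patch or from efile; the caller would invoke it
repeatedly and hand control back to the emulator between calls.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy at most max_bytes from in_fd to out_fd.
 * Returns the number of bytes copied (0 means end of input),
 * or -1 on error with errno set. */
static ssize_t copy_bounded(int in_fd, int out_fd, size_t max_bytes)
{
    char buf[65536];
    size_t total = 0;

    while (total < max_bytes) {
        size_t want = max_bytes - total;
        if (want > sizeof buf)
            want = sizeof buf;

        ssize_t n = read(in_fd, buf, want);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted, retry the read */
            return -1;
        }
        if (n == 0)
            break;              /* end of input */

        /* write() may be partial, so loop until this chunk is out */
        size_t off = 0;
        while (off < (size_t)n) {
            ssize_t w = write(out_fd, buf + off, (size_t)n - off);
            if (w < 0) {
                if (errno == EINTR)
                    continue;   /* interrupted, retry the write */
                return -1;
            }
            off += (size_t)w;
        }
        total += (size_t)n;
    }
    return (ssize_t)total;
}
```

Each call does a bounded amount of work, so the copy as a whole is
finished when a call returns 0.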

Do you have a real need for a faster file copy
operation or is it just nice to have?

-- 
Björn Gustavsson, Erlang/OTP, Ericsson AB

