[erlang-questions] Request for enhancement: Sparse files

Richard O'Keefe
Thu Jun 11 01:34:37 CEST 2009

On 10 Jun 2009, at 8:43 pm, Ville Silventoinen wrote:

> Hi Richard,
> Thank you for the reminders. I need to do two things: Scan big
> filesystems (billions of files, petabytes of data) and copy them. Both
> of these I've implemented in Erlang, but while doing so I've found
> some limitations in Erlang file module (some may remember my earlier
> post about file:write_link_info). If you read my original email, I
> have requested two things:
> 1) Adding st_blkcnt and st_blksize information to the file_info record.
> 2) Support for sparse files in file:copy if possible.

In brief, I was _supporting_ you.
(1) These things are standard across modern UNIX systems and
     can be trivially (and reasonably) faked on Windows.
(2) You can write a file copy procedure in C that will avoid
     allocating all-zeros pages WITHOUT even knowing what the
     block size is, so it's not unreasonable for a file-copy
     function to do this.
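
As a rough illustration of point (2), here is a minimal sketch of such a
copy procedure for a POSIX system. The names sparse_copy, write_all,
all_zero and the CHUNK size are made up for this example; nothing here is
taken from the Erlang file module or from any existing cp(1). It reads
fixed-size chunks and seeks over the all-zero ones instead of writing
them, so the destination filesystem is free to leave holes there. CHUNK
is arbitrary and is not derived from st_blksize. Whether blocks are
really left unallocated depends on the filesystem, and the sketch does
not retry on EINTR.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define CHUNK 4096  /* arbitrary; need not match the filesystem block size */

/* Return 1 if the n bytes at buf are all zero.  If buf[0] is zero and
   every byte equals its successor, then every byte is zero. */
static int all_zero(const unsigned char *buf, size_t n)
{
    if (n == 0) return 1;
    return buf[0] == 0 && memcmp(buf, buf + 1, n - 1) == 0;
}

/* Write exactly n bytes, continuing after short writes.  0 or -1. */
static int write_all(int fd, const unsigned char *buf, size_t n)
{
    while (n > 0) {
        ssize_t w = write(fd, buf, n);
        if (w < 0) return -1;
        buf += w;
        n -= (size_t)w;
    }
    return 0;
}

/* Copy src to dst, seeking over all-zero chunks instead of writing
   them.  Returns 0 on success, -1 on any error. */
int sparse_copy(const char *src, const char *dst)
{
    unsigned char buf[CHUNK];
    ssize_t n;
    off_t total = 0;
    int rc = -1;

    int in = open(src, O_RDONLY);
    if (in < 0) return -1;
    int out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out < 0) { close(in); return -1; }

    while ((n = read(in, buf, sizeof buf)) > 0) {
        if (all_zero(buf, (size_t)n)) {
            /* Skip ahead; the gap reads back as zeros. */
            if (lseek(out, n, SEEK_CUR) == (off_t)-1) goto done;
        } else if (write_all(out, buf, (size_t)n) < 0) {
            goto done;
        }
        total += n;
    }
    if (n < 0) goto done;

    /* Seeking alone does not extend the file, so a trailing run of
       zeros would be lost without setting the length explicitly. */
    if (ftruncate(out, total) < 0) goto done;
    rc = 0;
done:
    close(in);
    if (close(out) < 0) rc = -1;
    return rc;
}
```

A call such as sparse_copy("big.img", "copy.img") (file names made up)
should produce a byte-for-byte identical copy; comparing st_blocks from
stat(2) on the two files is one way to see whether holes were kept.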

There are other properties of a file that one might wish to preserve.
(A) Some versions of UNIX have access control lists, Solaris for one.
     Copying the contents of a file arguably needn't copy this info.
(B) Some versions of UNIX have extended attributes; Mac OS X, Solaris,
     and I think btrfs for Linux do this.  Copying the _meaning_
     of the contents of a file arguably _does_ require copying this.
     And of course Windows has extended attributes too.

It seems that ANYONE copying a file needs a file copying method
that is about as capable as the Mac OS X copyfile(3) function.

Come to think of it, it would be rather nice to have versions
of copyfile(3) for Linux and Solaris.

In the meantime, of course, for any particular UNIX flavour,
you can shell out to cp(1).
