[erlang-questions] Request for enhancement: Sparse files
Tue Jun 9 12:11:06 CEST 2009
What do you mean by application logic? I can write an Erlang driver
that gets the actual block size of a file; is that then part of
application logic?
I would think the blocks information would be useful to Unix, Linux
and Mac users who have to deal with files.
How is providing two more fields in the file_info record non-portable?
What needs porting? You can just ignore them on Windows (until they
decide to become Unix-based as well ;-), or don't provide the fields
at all (like Python stat).
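
For reference, the check from the original request quoted below
(st_size > st_blocks * 512) can be sketched in C roughly as follows.
This is only an illustration, not code from OTP: the names looks_sparse
and is_sparse are made up here, and it assumes st_blocks is counted in
512-byte units, as in the quoted message. Filesystems that compress
data or store small files inline may report fewer blocks than the size
suggests, so treat the result as a hint rather than proof.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: apply the rule st_size > st_blocks * 512.
 * Both arguments are widened to long long to avoid overflow. */
int looks_sparse(long long size, long long blocks)
{
    return size > blocks * 512LL ? 1 : 0;
}

/* Hypothetical helper: returns 1 if the file at path looks sparse,
 * 0 if it does not, and -1 if stat() fails. */
int is_sparse(const char *path)
{
    struct stat sb;

    if (stat(path, &sb) != 0)
        return -1;

    /* Assumption: st_blocks is in 512-byte units on this system. */
    return looks_sparse((long long)sb.st_size, (long long)sb.st_blocks);
}
```

A directory scanner could call is_sparse(path) per file and, for files
where it returns 1, account st_blocks * 512 bytes instead of st_size.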
On Tue, Jun 9, 2009 at 10:26 AM, Alex Arnon wrote:
> Can you tell where actual data blocks reside, using application logic?
> (I know this might not solve your problem, but such nonportable
> features probably have little chance of being incorporated into the
> standard distribution.)
> On Mon, Jun 8, 2009 at 6:00 PM, Ville Silventoinen wrote:
>> I've written a directory scanner in Erlang that calculates disk usage
>> per Unix user and group. It works well, except that the results are
>> wrong when it encounters sparse files (files with holes):
>> http://en.wikipedia.org/wiki/Sparse_file. We have some users that have
>> files that seem to be 1-2 terabytes, but in reality occupy less than a
>> gigabyte on disk.
>> The C stat struct has two fields I'd need (shown by "stat" command in
>> blksize_t st_blksize; /* optimal I/O size */
>> blkcnt_t st_blocks; /* allocated 512-byte blocks */
>> If st_size > st_blocks * 512, the file is sparse. Unfortunately,
>> read_file_info/read_link_info don't provide the blocks information.
>> Any chance this could be included in some future Erlang release?
>> Also, any chance file:copy would support copying sparse files? :-) I
>> tested it, and the target file becomes non-sparse.
>> Erlang has been a great help in our environment, where simple rsync has
>> become too slow...
>> P.S. I tried to add blocks information to my R13B Erlang environment,
>> but I broke the build system somehow (something to do with the fact
>> that prim_file is preloaded? "make preloaded" got me a bit further,
>> but I gave up when compile:compile/3 became undef). Below are the
>> changes I made to otp_src_R13B sources:
>> # diff efile_drv.c efile_drv.c.original
>> < put_int32(d->info.block_size, &resbuf[1 + (29 *
>> < put_int32(d->info.blocks_high, &resbuf[1 + (30 *
>> < put_int32(d->info.blocks_low, &resbuf[1 + (31 *
>> < #define RESULT_SIZE (1 + (32 * 4))
>> > #define RESULT_SIZE (1 + (29 * 4))
>> # diff erl_efile.h erl_efile.h.original
>> < Uint32 block_size; /* Optimal I/O size. */
>> < Uint32 blocks_low; /* Allocated 512-byte blocks, lower 32 bits. */
>> < Uint32 blocks_high; /* Allocated 512-byte blocks, higher 32 bits. */
>> # diff unix_efile.c unix_efile.c.original
>> < pInfo->blocks_high = 0;
>> < pInfo->blocks_high = (Uint32)(statbuf.st_blocks >> 32);
>> < pInfo->blocks_low = (Uint32)statbuf.st_blocks;
>> < pInfo->block_size = (Uint32)statbuf.st_blksize;
>> # diff prim_file.erl prim_file.erl.original
>> < [Mode, Links, Major, Minor, Inode, Uid, Gid, Access|Tail4] = Tail3,
>> < [BlockSize, HighBlocks, LowBlocks] = Tail4,
>> < Blocks = HighBlocks * 16#100000000 + LowBlocks,
>> > [Mode, Links, Major, Minor, Inode, Uid, Gid, Access] = Tail3,
>> < gid = Gid,
>> < block_size = BlockSize,
>> < blocks = Blocks}.
>> > gid = Gid}.
>> # diff file.hrl file.hrl.original
>> < gid :: integer(), % Group id for owner.
>> < block_size :: non_neg_integer(), % On Unix, optimal I/O
>> < blocks :: non_neg_integer()}). % On Unix, allocated 512-byte blocks.
>> > gid :: integer()}). % Group id for owner.
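
The high/low split used in the quoted patch can be checked in
isolation. The sketch below is hypothetical helper code, not part of
the patch: blocks_high, blocks_low and blocks_join are names invented
here. It splits a 64-bit block count into two 32-bit words, as the
unix_efile.c change does, and recombines them in the same way as
HighBlocks * 16#100000000 + LowBlocks in prim_file.erl.

```c
#include <stdint.h>

/* Upper 32 bits of a 64-bit block count. */
uint32_t blocks_high(uint64_t blocks)
{
    return (uint32_t)(blocks >> 32);
}

/* Lower 32 bits of a 64-bit block count. */
uint32_t blocks_low(uint64_t blocks)
{
    return (uint32_t)(blocks & 0xFFFFFFFFu);
}

/* Recombine the two words; equivalent to high * 2^32 + low. */
uint64_t blocks_join(uint32_t high, uint32_t low)
{
    return ((uint64_t)high << 32) | (uint64_t)low;
}
```

Sending the count as two words keeps each value within what a 32-bit
put_int32 call can carry, at the cost of one extra field in the result
buffer.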