file:read with read_ahead and binaries broken

Wed Oct 28 13:30:28 CET 2009

dd if=/dev/urandom of=/tmp/file.rnd bs=1M count=20

-module(test).
-export([test/1]).

%% Read one byte, skip the next, and accumulate the bytes read.
test(Hdl) ->
    test(Hdl, []).

test(Hdl, Acc) ->
    case file:read(Hdl, 1) of
        {ok, <<Num:1/binary>>} ->
            {ok, _Pos} = file:position(Hdl, {cur, 1}),
            test(Hdl, [Num|Acc]);
        eof ->
            Acc
    end.

1> f(), {ok, Hdl} = file:open("/tmp/file.rnd", [read, read_ahead, binary, raw]),
  X = test:test(Hdl), ok = file:close(Hdl).

Erlang will die. Badly. erlang:memory() shows that, of the 4GB Erlang
had claimed before I killed it, 3.9GB was binary data.

Ways to stop this going nuts:
1) Don't use read_ahead
2) Remove the position call - instead, read 2 bytes and skip the second
3) Add any random term, say 'foo', to the Acc rather than Num.
4) Have Num as an int, not a binary.
5) Do the following:
        {ok, <<Num:8>>} -> {ok, _Pos} = file:position(Hdl, {cur, 1}),
                           <<Num2:1/binary>> = <<Num:8>>,
                           test(Hdl, [Num2|Acc]);
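
For reference, workaround (2) could look roughly like the sketch below. The
module name test_skip is made up for illustration, and the extra clause for
an odd-length file is my own addition rather than part of the original
report.

```erlang
-module(test_skip).
-export([test/1]).

%% Sketch of workaround (2): read two bytes per call and discard the
%% second one, so file:position/2 is never called.
test(Hdl) ->
    test(Hdl, []).

test(Hdl, Acc) ->
    case file:read(Hdl, 2) of
        {ok, <<Num:1/binary, _Skipped:1/binary>>} ->
            test(Hdl, [Num|Acc]);
        {ok, <<Num:1/binary>>} ->
            %% Odd-length file: the final read returns a single byte.
            [Num|Acc];
        eof ->
            Acc
    end.
```

It is called the same way as the original, with a handle opened using
[read, read_ahead, binary, raw].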

My guess is that the read pulls in a whole disk page (as it should) and
Num is a pointer into the start of that page, so the rest of the page
beyond the first byte isn't reclaimed. The position call then seemingly
invalidates the entire page. That invalidation is confirmed by the fact
that strace -f -c -p $PID shows the same number of calls to read in both
the read_ahead and non-read_ahead versions. Interestingly, though, there
are twice as many calls to lseek in the read_ahead version.
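
The retention part of the guess above can be illustrated in isolation. The
sketch below uses no file I/O and only shows the general sub-binary
mechanism, so it does not prove that this is what file:read is doing. It
assumes the binary module (binary:copy and binary:referenced_byte_size),
which as far as I know only arrived in OTP releases later than R13B02. The
module name subbin_demo is made up, and the 1000-byte slice is an arbitrary
choice.

```erlang
-module(subbin_demo).
-export([demo/0]).

%% Take a small slice out of a large binary and report how much memory
%% the slice keeps referenced, before and after an explicit copy.
demo() ->
    %% Stand-in for a read_ahead buffer: 64KB of zero bytes.
    Buffer = binary:copy(<<0>>, 65536),
    %% Slice is a sub-binary that points into Buffer.
    <<Slice:1000/binary, _Rest/binary>> = Buffer,
    %% Copy is a fresh binary that no longer refers to Buffer.
    Copy = binary:copy(Slice),
    {byte_size(Slice),
     binary:referenced_byte_size(Slice),
     binary:referenced_byte_size(Copy)}.
```

The middle element of the result should be the size of the whole buffer
rather than the size of the slice, while the copy should reference only its
own 1000 bytes. Keeping many such slices in a list, as Acc does, would then
keep every underlying buffer alive.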

Judging by the size of the file itself, both the read_ahead and
non-read_ahead versions are really issuing a read for every single byte
read, and the read_ahead version also has the advantage of issuing twice
as many seeks.

A quick test shows this happens at least as far back as R12B5, and still
happens in R13B02.

Oh, and if you follow suggestion (5), you'll find the read_ahead version
is about 8 times slower than the non-read_ahead version.

Matthew