The real reason why file:read_file in R7B guzzles memory

Robert Virding rv@REDACTED
Mon Nov 20 11:51:41 CET 2000


matthias@REDACTED writes:
>Hi,
...
>Removing the bug
>
>   One way is to change the clause in io_lib_pretty:write_length 
>   for binaries to
>
>    write_length(Bin, D, Acc, Max) when binary(Bin) ->
>       Acc;
>
>   I'm not sure what the 'depth' rules are, but the result
>   looks reasonable. The old clause was
>
>    write_length(Bin, D, Acc, Max) when binary(Bin) ->
>       Acc + length(io_lib:write(Bin));
>
>   This performs badly because it causes io_lib:write/1 to
>   generate a list representing the whole binary. In R6 it
>   performed ok because binaries were always written as a short
>   string.

You have found the right place, but have misinterpreted the
write_length/4 function.  It must return the total length needed to
write a thing, but it also takes the maximum allowed length as an
argument so that it can bail out early.  The calling function uses this
length to determine whether the whole thing will fit on the current
line or must be broken up into separate lines.

%% write_length(Term, Depth, Accumulator, MaxLength) -> integer()
%%  Calculate the print length of a term, but exit when length becomes
%%  greater than MaxLength.

By returning just the accumulator you have in fact stated that printing 
a binary takes zero characters, which is not quite correct.  Try 
printing

io:fwrite("~P", [{ok, Bin, 3}, 20]).

to see the difference.
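
To make the contract concrete, here is a rough sketch of a length
calculation that bails out early. It is not the io_lib_pretty code: the
module name wl_sketch, the functions len/4 and fits/3, and the BinLen
fun argument are all made up for illustration, depth is ignored, and
the guard is spelled is_binary/1 so that it compiles on a current
Erlang/OTP.

```erlang
-module(wl_sketch).
-export([len/4, fits/3]).

%% len(Term, BinLenFun, Accumulator, MaxLength) -> integer()
%%  Hypothetical, simplified stand-in for write_length/4.  Adds the
%%  print length of Term to the accumulator and gives up as soon as the
%%  accumulator is greater than MaxLength.  BinLenFun decides how many
%%  characters a binary is claimed to need.
len(_Term, _BinLen, Acc, Max) when Acc > Max ->
    Acc;                                        %Already too long
len(Bin, BinLen, Acc, _Max) when is_binary(Bin) ->
    Acc + BinLen(Bin);
len(Tuple, BinLen, Acc, Max) when is_tuple(Tuple) ->
    %% Two braces plus one comma between each pair of elements.
    Elems = tuple_to_list(Tuple),
    Seps = max(length(Elems) - 1, 0),
    lists:foldl(fun (E, A) -> len(E, BinLen, A, Max) end,
                Acc + 2 + Seps, Elems);
len(Term, _BinLen, Acc, _Max) ->
    %% Atoms, integers and so on: measure what io_lib:write/1 returns.
    Acc + lists:flatlength(io_lib:write(Term)).

%% fits(Term, BinLenFun, MaxLength) -> boolean()
%%  The caller's question: does Term fit in MaxLength characters?
fits(Term, BinLen, Max) ->
    len(Term, BinLen, 0, Max) =< Max.
```

With a 100 byte binary Bin, fits({ok, Bin, 3}, fun (_) -> 0 end, 20)
returns true, so the caller would try to put the whole tuple on one 20
character line.  With fun (B) -> 4 + 4*byte_size(B) end it returns
false, and len/4 stops adding as soon as the binary has pushed the
accumulator past the maximum.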

A quick and dirty solution would be to claim that binaries always take 
lots of space and always fill up the available space:

write_length(Bin, D, Acc, Max) when binary(Bin) ->
    Max;

A better solution would probably be something like:

write_length(Bin, D, Acc, Max) when binary(Bin) ->
    if  D < 0 ->                                %Used to print all
            Acc + 4 + 4*size(Bin);
        true ->
            Acc + 4 + 4*(D+1)
    end;

which is fast but accurate enough.  The existing binary code is actually
WRONG, which I can comfortably say having written everything except the
binary stuff. :-)
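
As a sanity check on the numbers: "<<" and ">>" account for the 4, and
each byte needs at most three digits plus a comma, hence the 4 per
byte.  The sketch below compares that bound with what io_lib:write/1
produces on a current Erlang/OTP.  The module name bin_estimate and the
functions estimate/2 and actual/1 are made up for illustration, and the
guard is spelled is_binary/1 instead of binary/1 so that it compiles
today.

```erlang
-module(bin_estimate).
-export([estimate/2, actual/1]).

%% estimate(Bin, Depth) -> integer()
%%  The bound from the clause above, with the accumulator taken as 0.
estimate(Bin, D) when is_binary(Bin), D < 0 ->  %Used to print all
    4 + 4*byte_size(Bin);
estimate(Bin, D) when is_binary(Bin) ->
    4 + 4*(D+1).

%% actual(Bin) -> integer()
%%  Number of characters io_lib:write/1 really generates.  This builds
%%  the whole list, which is exactly the cost the estimate avoids.
actual(Bin) when is_binary(Bin) ->
    lists:flatlength(io_lib:write(Bin)).
```

For <<1,2,3>> the estimate is 16 against an actual length of 9, and for
<<255,255,255>> it is 16 against 15, so the estimate errs on the long
side without ever touching the contents of the binary.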

This is actually a general problem with formatted printing: you have to 
print a thing to know what it looks like before you know how to print it.

	Robert




