[erlang-bugs] escript results in disk_log space leak in R12B4
anthony.shipman@REDACTED
anthony.shipman@REDACTED
Thu Aug 25 06:18:36 CEST 2011
I have a simple escript that dumps the contents of a disk_log file. It uses
disk_log:chunk() to iterate through the records and prints them. The memory
usage climbs dramatically on a large file.
The loop in the program is
readLog(Cont) ->
case disk_log:chunk(log, Cont) of
eof ->
ok;
{error, Reason} ->
error("when reading the log: ~p", [Reason]),
ok;
{Cont2, Terms} ->
dumpTerms(Terms),
%util:debug("Cont2 ~p\n", [Cont2]),
readLog(Cont2);
{Cont2, Terms, BadBytes} ->
dumpTerms(Terms),
warning("~p bad bytes were skipped", [BadBytes]),
readLog(Cont2)
end.
The Cont term just contains terms like
Cont2 = {continuation,<0.34.0>,65408,271}
I've put in some statistics tracking code. For these tests dumpTerms() is a
dummy function. Early in the run I have
<0.29.0>: [{registered_name,[]},
{initial_call,{erlang,apply,2}},
{current_function,{disk_log,bins2terms,2}},
{memory,2057284},
{heap_size,514229},
{stack_size,66},
{total_heap_size,514229},
{binary,[{3078463584,65536,287},
{3078463584,65536,287},
....... 286 of these lines
{3078463584,65536,287},
{3078463584,65536,287}]},
{dictionary,[]},
{garbage_collection,[{fullsweep_after,1},{minor_gcs,0}]}]
<0.34.0>: [{registered_name,[]},
{initial_call,{proc_lib,init_p,5}},
{current_function,{disk_log,loop,1}},
{memory,17212},
{heap_size,4181},
{stack_size,7},
{total_heap_size,4181},
{binary,[{3078463584,65536,287}]},
{dictionary,
[{log,
{log,ok,log,none,0,"../logs/diag_archive_2011-08-05.log",
[{<0.29.0>,false}],
halt,internal,halt_int,none,read_only,infinity,
{halt,
{cache,
{file_descriptor,prim_file,{#Port<0.101>,8}},
0,[]},
709715947,infinity},
undefined}},
{'$ancestors',
[disk_log_sup,kernel_safe_sup,kernel_sup,<0.8.0>]},
{'$initial_call',{disk_log,init,[<0.32.0>,<0.33.0>]}}]},
{garbage_collection,[{fullsweep_after,1},{minor_gcs,0}]}]
This makes sense. The program's loop has sliced a 64K binary from the disk_log
server into many records. A little later the address of the binary is
different. This also makes sense. A GC occurred to clean out the old binaries
and only the most recently read binary appeared in the stats. But much later
I have
<0.29.0>: [{registered_name,[]},
{initial_call,{erlang,apply,2}},
{current_function,undefined},
{memory,154539648},
{heap_size,8024355},
{stack_size,1970},
{total_heap_size,38634820},
{binary,[{2988208416,65536,310},
{2988208416,65536,310},
...............
{3083842192,65536,254},
...............
{2987487256,65536,283},
lots of entries from 22 different binaries
...............
{dictionary,[]},
{garbage_collection,[{fullsweep_after,1},{minor_gcs,1}]}]
<0.34.0>: [{registered_name,[]},
{initial_call,{proc_lib,init_p,5}},
{current_function,{disk_log,loop,1}},
{memory,17212},
{heap_size,4181},
{stack_size,7},
{total_heap_size,4181},
{binary,[{2988995136,65536,319}]},
{dictionary,
[{log,
{log,ok,log,none,0,"../logs/diag_archive_2011-08-05.log",
[{<0.29.0>,false}],
halt,internal,halt_int,none,read_only,infinity,
{halt,
{cache,
{file_descriptor,prim_file,{#Port<0.101>,8}},
0,[]},
709715947,infinity},
undefined}},
{'$ancestors',
[disk_log_sup,kernel_safe_sup,kernel_sup,<0.8.0>]},
{'$initial_call',{disk_log,init,[<0.32.0>,<0.33.0>]}}]},
{garbage_collection,[{fullsweep_after,1},{minor_gcs,0}]}]
The memory usage of the readLog() process has gone up to 150MB and the stack
is growing.
I thought that there was some serious problem with the GC until I finally
remembered to try -mode(compile) and the problem went away. This is a trap in
the behaviour of the escript interpreter. Something in the interpreter is not
properly tail-call optimized.
--
Anthony Shipman | What most people think about
Anthony.Shipman@REDACTED | most things is mostly wrong.
More information about the erlang-bugs
mailing list