Working with large binaries in the interpreter

Bob Ippolito bob@REDACTED
Fri Sep 1 21:39:54 CEST 2006


I'm writing an application that keeps a large (24.54M) database file
in memory as a binary. Normally I would use something like mmap for
this purpose, but I haven't found a similar facility in Erlang.

The problem is that whenever I get a traceback with that binary on the
stack, or otherwise end up with a printed representation of the binary
in the shell, memory usage grows enormously and doesn't go back down.
If this happens more than once, I start swapping. Is there anything I
can do about this? It's really difficult to debug when I have to kill
beam on any error.

Also, the binary memory usage seems suspiciously high. There should be
exactly one 25737850 byte (24.54M) binary, but there are 16271355 bytes
(15.5M) unaccounted for. It should not be a view on a larger binary,
because that binary is the entire uncompressed file.
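
The arithmetic behind that figure can be written down as a small
helper. This is only a sketch: the module name binmem and the function
unaccounted/3 are made up for illustration, and it assumes the baseline
is the value of erlang:memory(binary) sampled before the file was
loaded.

```erlang
-module(binmem).
-export([unaccounted/3]).

%% Bytes of binary memory explained neither by the baseline nor by
%% Bin itself. Current and Baseline are erlang:memory(binary) values
%% taken after and before loading; Bin is the loaded binary.
unaccounted(Current, Baseline, Bin) when is_binary(Bin) ->
    Current - Baseline - size(Bin).
```

With the numbers from the session below,
binmem:unaccounted(42071665, 62460, element(5, D)) is the same
subtraction as shell command 4.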

(This is Erlang/OTP R11B-0 on Mac OS X 10.4 Intel)

Erlang (BEAM) emulator version 5.5 [source] [async-threads:0]

Eshell V5.5  (abort with ^G)
1> erlang:memory().
[{total,2799855},
 {processes,346118},
 {processes_used,340190},
 {system,2453737},
 {atom,213345},
 {atom_used,197118},
 {binary,62460},
 {code,1594800},
 {ets,108032}]
2> {ok, D} = egeoip:new(), garbage_collect().
true
3> erlang:memory().
[{total,44853447},
 {processes,315186},
 {processes_used,309258},
 {system,44538261},
 {atom,216649},
 {atom_used,201683},
 {binary,42071665},
 {code,1664475},
 {ets,110372}]
4> 42071665 - 62460 - size(element(5, D)).
16271355
5> D. garbage_collect().
{geoipdb,2,
         3,
         3576103,
         <<1,0,0,107,0,0,2,0,0,60,0,0,3,0,0,30,0,0,4,0,0,18,0,...>>,
         "/Users/bob/src/egeoip/priv/GeoLiteCity.dat.gz"}
6> garbage_collect().
true
7> erlang:memory().
[{total,489893135},
 {processes,445361822},
 {processes_used,445355894},
 {system,44531313},
 {atom,216649},
 {atom_used,201683},
 {binary,42064717},
 {code,1664475},
 {ets,110372}]

Why did processes grow by 445046636 bytes (424.43M) after printing
that representation?!
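
That growth figure is just the difference between the processes
entries of two erlang:memory() snapshots. A minimal sketch of the
comparison, assuming both snapshots are the property lists returned by
erlang:memory/0 (memdelta and delta/3 are names made up for
illustration):

```erlang
-module(memdelta).
-export([delta/3]).

%% Growth of one erlang:memory() category between two snapshots.
%% Before and After are property lists such as
%% [{total,N}, {processes,N}, ...]; Key must be present in both.
delta(Key, Before, After) ->
    proplists:get_value(Key, After) - proplists:get_value(Key, Before).
```

Called as memdelta:delta(processes, Before, After) with the results of
shell commands 3 and 7, this gives the 445046636 bytes quoted above.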

-bob


