Garbage collection of binaries

Shawn Pearce spearce@REDACTED
Sat Oct 18 03:14:09 CEST 2003

Perhaps add a total 'large binary' size to the process structure.

This large_bin_siz field would store the total size of all
referenced binaries (where total is the total memory footprint, not
just the subsection referenced).

During each GC the large_bin_siz field would be zeroed and resummed
while all references are being copied.  References which are not
copied would have the binary's refcnt decremented as occurs today.

Creating a new binary reference would automatically increment
large_bin_siz within the process structure, unless the new binary
reference is being formed from a subsection of an existing binary.
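
A rough sketch of that bookkeeping, in Python purely for illustration
(this is a toy model, not beam code). The Binary, BinRef and Process
classes and the new_ref/sub_ref/gc names are made up for the example;
only large_bin_siz and refcnt come from the proposal above. Counting
each distinct binary once during the re-sum is my assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Binary:
    """Hypothetical model of a refcounted binary on the global heap."""
    size: int          # total memory footprint in bytes
    refcnt: int = 0


@dataclass
class BinRef:
    """Process-local reference to a binary, or to a subsection of one."""
    binary: Binary
    offset: int
    length: int


@dataclass
class Process:
    refs: list = field(default_factory=list)
    large_bin_siz: int = 0

    def new_ref(self, binary):
        # A fresh reference charges the binary's full footprint.
        binary.refcnt += 1
        ref = BinRef(binary, 0, binary.size)
        self.refs.append(ref)
        self.large_bin_siz += binary.size
        return ref

    def sub_ref(self, parent, offset, length):
        # A subsection of an already referenced binary is not charged again.
        parent.binary.refcnt += 1
        ref = BinRef(parent.binary, offset, length)
        self.refs.append(ref)
        return ref

    def gc(self, live):
        # Zero the field, then re-sum it while "copying" live references.
        live_ids = {id(r) for r in live}
        self.large_bin_siz = 0
        counted = set()      # assumption: each binary is summed once
        survivors = []
        for ref in self.refs:
            if id(ref) in live_ids:
                survivors.append(ref)
                if id(ref.binary) not in counted:
                    counted.add(id(ref.binary))
                    self.large_bin_siz += ref.binary.size
            else:
                # Not copied: drop the refcnt, as happens today.
                ref.binary.refcnt -= 1
        self.refs = survivors
```

Note that if only a small subsection reference survives a GC, the
process is still charged the whole binary, which matches the "total
memory footprint" definition above.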

When allocating a new large binary in the global heap, the large
binary allocator examines current free memory and determines if
binary collection might need to occur to prevent heap expansion.
If so, it scans the active process list and examines the large_bin_siz
fields, requesting GC from each process which has a sufficiently large
value.
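
The scan could look roughly like the sketch below (Python, illustrative
only). The function name, the dict layout for a process, and the
threshold parameter are all assumptions of mine; the proposal does not
say how "sufficiently large" would be decided.

```python
def processes_to_collect(procs, needed, free, threshold):
    """Pick the processes to ask for a GC before allocating `needed` bytes.

    procs     -- iterable of dicts with a "large_bin_siz" key (hypothetical)
    needed    -- size of the large binary being allocated
    free      -- current free memory in the large binary heap
    threshold -- assumed cutoff for "sufficiently large"
    """
    if needed <= free:
        # Enough room already, no binary collection required.
        return []
    # Scan the active process list and select the heavy holders.
    return [p for p in procs if p["large_bin_siz"] >= threshold]
```

For example, with two processes holding 10 and 5000 bytes and a
threshold of 1024, only the second would be asked to GC, and only when
the allocation does not already fit in free memory.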

After GC'ing all of those processes, enough large binary memory should
have been freed.  If not, the heap expands or a global GC occurs.  Which
gets selected might be based on how successful a global GC has been in the
past few allocations to get the necessary memory.  Some apps will need
a heavy period of heap expansion, then just local GCs, while some may
need just global GCs, etc.
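
One way to make that choice, sketched in Python for illustration: keep a
short history of whether recent global GCs freed enough memory, and fall
back to heap expansion when they mostly did not. The ReclaimPolicy
class, the window size and the 0.5 success rate are all hypothetical
values picked for the example, not anything beam does.

```python
from collections import deque


class ReclaimPolicy:
    """Choose between a global GC and heap expansion (hypothetical)."""

    def __init__(self, window=8, min_success_rate=0.5):
        # Outcomes of the most recent global GCs, True if one freed enough.
        self.history = deque(maxlen=window)
        self.min_success_rate = min_success_rate

    def record_global_gc(self, freed, needed):
        # Remember whether this global GC recovered the memory we wanted.
        self.history.append(freed >= needed)

    def choose(self):
        # With no history yet, try a global GC first (an assumption).
        if not self.history:
            return "global_gc"
        rate = sum(self.history) / len(self.history)
        if rate >= self.min_success_rate:
            return "global_gc"
        return "expand_heap"
```

An app in a heavy growth phase would quickly see its global GCs fail and
switch to expansion, then drift back to GCs once they start paying off.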

I'm not up on the internals of beam lately, so I don't know what effect
this might have on forcing a process to wait for others to GC, or if
the process trying to allocate the large binary is also itself requested
to GC.

Just a thought.

IMHO, this is one of Erlang's larger warts.  It's soooo easy to run
a node out of memory with binaries.  Almost to the point that it removes
much of the reason to use Erlang.  :-)

Jesper Wilhelmsson <jesperw@REDACTED> wrote:
> On Fri, 17 Oct 2003, Joachim Durchholz wrote:
> > Jesper Wilhelmsson wrote:
> > > Binaries are reference counted and are removed when their counter
> > > reaches zero.
> >
> > This gives me another approach: when you're done with a binary, zero out
> > all references to it. The binary will go away immediately.
> Well.. The only place where the reference count is decremented is in the
> GC and when a process dies.
> To manually decrement the reference count and promise the system to never
> use the address again is not a very functional way of doing things.
> .joppe
> >
> > (This might be impossible if the reference is created as an intermediate
> > result. It might also interfere with tail recursion. Using reference
> > counts in a functional language doesn't strike me as a particularly good
> > solution in this light...)
> >
> > Regards,
> > Jo


  There are no data that cannot be plotted on a straight line if the axes
  are chosen correctly.
