[erlang-questions] effect of destructive updates on GC implementation

Tue Jan 29 19:16:53 CET 2008

I read a lot on this list and elsewhere about how destructive updates
would complicate Erlang's garbage collector. I want to argue that if
you have controlled destructive updates, the added GC complexity is
small. My larger goal is to understand the issues involved in
introducing destructive updates into Erlang in a harmonious way. (As
opposed to something like ets tables, which have a number of
limitations and quirks that seem to follow from the difficulty of
integrating them with the GC.)

I'll proceed from a basic starting point, to flush out any bugs in my
understanding and to bring newbies up to speed. In a generational GC
like Erlang's, you have (at least) two generations of objects, young
and old. Since most garbage is young, you speed up GC by collecting
only the young generation. Only when that fails to give you enough
space do you need to do a full GC of young and old.

The crucial observation that allows you to examine only the young
generation when collecting it is that there are no pointers from old
objects to young. Thus you need not fear that a young object,
seemingly garbage from your examination of all and only the young
objects, is in fact referenced by some old object. A purely functional
language guarantees this no-old-to-young property, because (a) it is a
fact of logic that a newly created object can only contain pointers to
objects older than itself, and (b) you can't update, so you can't
change that invariant. My understanding is that though Erlang is not
purely functional, it is from the point of view of its GC, because
destructive updates (to ets tables and process dictionaries) occur
outside of the GCable heap.

It is possible to modify a generational GC to handle updates, by doing
two things. First, you must examine all updates to see if they
introduce an old-to-young pointer. This is called a *write barrier*.
Second, you must keep track of all old-to-young pointers, so you can
add them to your root set when GCing the young generation. In general,
the first of these is by far the more difficult, because you must
efficiently examine every destructive pointer update to determine if
it introduces an old-to-young relationship.

But write barriers are only difficult because they have to be very
efficient, and that is because standard imperative languages have many
updates. If there were relatively few updates,  there would be no need
for a complex write barrier. For instance, consider a feature like ML
references. A reference is a single updatable memory cell, with its
own special update operation. If the update operation compiles into
its own VM instruction, then the write barrier is just a simple check
in the implementation of that instruction. Since references are
cumbersome, the language encourages a purely functional style;
destructive updates are clearly marked syntactically, so they are ugly
and hence rare.

By making an updatable cell its own object (by which I just mean data
structure), we simplify the second required GC modification, keeping
track of the old-to-new pointers. A simple implementation can doubly
link the old-to-new cells, enabling quick scanning for GC and
constant-time removal from the list when the cell itself becomes
garbage. (Of course, more space-efficient implementations are
possible.)

I neglected to mention one other thing you need, a quick way to tell
the generation of an object. In a copying GC, you can dedicate each
heap page to a generation, and either keep bits on the page or have a
map from page addresses to generations. The high-order bits of an
object's address give you its page.

I haven't looked at the code for the Erlang GC, so I don't know how
amenable it is to these changes. I just wanted to dispel some of the
fear I imagine I hear when people intone "write barrier." Or maybe I
am missing something, in which case I would appreciate being educated.