[erlang-questions] Reference counting instead of GC?

Fri Jul 4 02:19:20 CEST 2008

Many years ago I did an Erlang implementation based on a reference
counting GC (it *is* gc even if not a mark-sweep or copying). Some
comments based on that work (no specific order):

- It worked very well and was reasonably efficient.
- One interesting property was that it reclaimed data *fast* so the
memory footprint was relatively small.
- Even though each object may be larger as there is only one copy then
memory usage is often less than for a copying collector.
- Freeing large unused structures is no problems, there well-known are
simple techniques for doing that in a non-blocking way.
- My work was done on a uni-processor so I have no experience with
doing it on a multi-core. I have seen that there are papers written on
the subject but I have not studied them.
- To get speed you have to be cunning at all levels so it is difficult
to just take the BEAM and hack in a different GC, especially one which
is so different.
- The biggest problem is probably the sheer work involved in doing
enough of an implementation to test it. As mentioned in previous
comment you have to redo basically all memory allocation everywhere.
- Erlang is suited for reference counting as it has no circular
structures, well it has a few circular references but they are only in
well-known places so no real problems.
- It was a fun memory system. Seeing all the reference counts drop to
0 at the end of a run was very satisfying. :-)
- Shared memory between processes at the *implementation level* is no
problem as the application never sees it, it can only make copies.

I really liked it, one of these days I will do another.

Robert

On 03/07/2008, Alpár Jüttner <alpar@REDACTED> wrote:
> On Thu, 2008-07-03 at 07:41 -0700, Thomas Lindgren wrote:
>
>> Good luck :-) However, note that the hipe guys have over the years
>>  proposed garbage collectors with most of the good properties you
>>  mention. (Cf the "shared" and "hybrid" emulators. E.g., "erl
>>  -hybrid".)
>
> Could you tell me where can I find some info about these modes? The erl
> manual doesn't even list the -shared and the -hybrid switches.
>
>> Also note that simple reference counting does not guarantee real time
>>  collection or bounded or even short pauses. Consider the case when you
>>  drop a big term: a reference counting implementation must decrement
>>  the reference counts of all subterms (recursively), while a copying
>>  collector won't even visit the dead data.
>>  Because the dead term can
>>  have arbitrary size, the pause while it is traversed is not bounded.
>
> Yes but at least it is a deterministic and predictable time you can
> calculate with.
>
>>  (But you may be thinking of more sophisticated variants than that.)
>
> Yes indeed, the process of dropping the unused terms can be safely
> interrupted.
>
> In fact, my idea was to have a special (low priority) process in the
> emulator taking care of the unused terms. In this way the dropping of
> big data could be automatically postponed if there are "more urgent"
> things to do.
>
> Best regards,
> Alpar
>
> _______________________________________________
> erlang-questions mailing list
> erlang-questions@REDACTED
> http://www.erlang.org/mailman/listinfo/erlang-questions
>