Concatenating atoms

Richard A. O'Keefe ok@REDACTED
Fri Feb 4 03:33:30 CET 2005


Let me offer another perspective on atom GC.

During the time that I worked on it, Quintus Prolog did not have an atom GC.
The representation for atoms was a bunch of tag bits here and there saying
"I am an atom" plus a 21-bit field which indexed an extensible array of
packed strings (the names).

This meant that QP could have a maximum of 2,097,152 atoms in any one image.
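
To make the limit concrete, here is a minimal sketch of that kind of table,
written in Python and not taken from QP's actual code.  It assumes only what
is described above: an atom is a 21-bit index into a growable array of names,
and nothing is ever removed.  The names AtomTable, AtomTableFull, and intern
are hypothetical, and the dict used for lookup merely stands in for whatever
hash table the real system had.

```python
ATOM_INDEX_BITS = 21                 # width of the index field, from the text
MAX_ATOMS = 1 << ATOM_INDEX_BITS     # 2**21 = 2,097,152 distinct atoms


class AtomTableFull(Exception):
    """Raised when no index is left for a new atom (hypothetical name)."""


class AtomTable:
    """Illustrative atom table without garbage collection."""

    def __init__(self, max_atoms=MAX_ATOMS):
        self.max_atoms = max_atoms
        self.names = []       # index -> name (the extensible array)
        self.index_of = {}    # name -> index (stand-in for the hash table)

    def intern(self, name):
        """Return the index for name, allocating a new slot if needed."""
        idx = self.index_of.get(name)
        if idx is not None:
            return idx
        if len(self.names) >= self.max_atoms:
            raise AtomTableFull("all %d atom slots are in use" % self.max_atoms)
        idx = len(self.names)
        self.names.append(name)
        self.index_of[name] = idx
        return idx

    def name(self, idx):
        """Return the name stored at index idx."""
        return self.names[idx]
```

Because no slot is ever reclaimed, every distinct name ever interned counts
against the limit for the life of the image, whether or not anything still
refers to it.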
But hey, our fancy new Sun 3/50 machines had a massive 4MB of memory, and
we were sure to run out of other stuff before filling up the table...

OK, so Sun 3/50 machines had virtual memory.  So one day I wondered how long
it would take to run out of atoms.  That's when I discovered that our single
global atom hash table was most unfriendly to VM; once we got past a certain
number of atoms we could create only 6 per second, doing nothing but making
atoms.  (See, paging over the network _was_ a bad idea!)  So we replaced the
atom hash table with something that was *much* kinder to the VM, and never
did fill up the atom table.

But now I have 640MB on my desktop UltraSPARC and 640MB on my G4 Mac.
The atom table could fill up quite quickly.

Parkinson's law of computer memory:  no computer _ever_ has enough memory.

There are two points to this anecdote:

(a) A limit which you are *never* going to reach in practice this year
    will be ridiculously small in 10 years' time.

(b) Running into the limit isn't the only problem.  Hanging on to lots of
    atoms you don't need any more can give you performance problems even
    if you are nowhere near the limit.

Actually there's a third point:

(c) Not having an atom GC can cost you customers.  There were people who
    _would_ have used QP but who wanted to hook it up with gigantic databases
    and pump large numbers of atoms through our system.  Those were sales
    we _didn't_ make.



