Concatenating atoms
Joe Armstrong (AL/EAB)
joe.armstrong@REDACTED
Tue Feb 1 12:18:14 CET 2005
> Thomas Lindgren wrote
> As an alternative to discouraging developers, I'd like
> to encourage the Erlang implementation community to,
> at long last, implement an atom GC :-) (Well, I really
> do.)
>
> Best,
> Thomas
Nja - Ummm - we're garbing the wrong thing - we should be garbing
the code space and not the atom space, atoms should be local to modules
and not global at all. There should not be a global atom table in the
first place - it violates the principle of isolation.
The atom table is an efficiency hack which should never have been made.
With a little careful re-design we could eliminate the atom table
and then no GC is required.
This would mean that each module would have to have its own
private atom table. With a little thought (little = about 10 years :-)
we could arrange that:
- atom comparisons within the same module are atomic
- comparisons of atoms in two different modules are
    - a hash table lookup the first time they are made
    - atomic the second time they are made
Atoms would be represented as

    (AtomTag, Pointer) -> (LocalHashTablePointer) -> Value
                          (RemoteHashTablePointer)

ie each Atom (a tagged pointer) points to two words.
The first is a pointer to the local module hash table.
The second is zero initially and is used to cache a hint
pointer (the hint points to an atom in a remote module
which is known to be the same as the local atom).
When two or more modules use the same atom, the numerically
lowest pointer should be used.
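
A rough simulation of the scheme above, in Python rather than in the
emulator's own C, might look like the following. Everything named here
(AtomCell, Module, canonical, atoms_equal, the fake address counter and
the stats dictionary) is made up for illustration and is not part of any
Erlang implementation. The sketch also assumes that only "equal" results
are cached, so unequal atoms from different modules take the slow path
every time; the description above leaves that case open.

```python
import itertools

# Hypothetical stand-in for machine addresses: each new cell gets a higher one.
_addresses = itertools.count(0x1000, 0x10)

# Counts how often each comparison path is taken (illustration only).
stats = {"fast": 0, "slow": 0}


class AtomCell:
    """The two words a tagged atom pointer refers to."""

    def __init__(self, name):
        self.address = next(_addresses)  # simulated address of this cell
        self.name = name                 # value reached via the local table
        self.remote_hint = None          # "zero" initially, later a cached hint


class Module:
    """A module with its own private atom table (no global table)."""

    def __init__(self, name):
        self.name = name
        self.atoms = {}  # atom name -> AtomCell

    def intern(self, atom_name):
        cell = self.atoms.get(atom_name)
        if cell is None:
            cell = self.atoms[atom_name] = AtomCell(atom_name)
        return cell


def canonical(cell):
    """Follow cached hints to the lowest-addressed cell known to be equal."""
    while cell.remote_hint is not None:
        cell = cell.remote_hint
    return cell


def atoms_equal(a, b):
    ra, rb = canonical(a), canonical(b)
    if ra is rb:
        # Same module, or a previously cached cross-module match.
        stats["fast"] += 1
        return True
    # First cross-module comparison: fall back to comparing the values.
    stats["slow"] += 1
    if ra.name != rb.name:
        return False
    # Cache the result: everything points at the numerically lowest cell.
    lowest, other = (ra, rb) if ra.address < rb.address else (rb, ra)
    other.remote_hint = lowest
    for cell in (a, b):
        if cell is not lowest:
            cell.remote_hint = lowest
    return True
```

Interning ok in two modules and comparing the two cells twice should take
the slow path once and the pointer-only path the second time. Hints always
point to a strictly lower address, so following them cannot loop.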
This would need a few more changes:
- we don't move code
- we garbage collect code (ie we don't have two versions)
- when code is finally removed (by GC) we sweep all
  code spaces, zeroing any cached remote hash table pointers
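
The sweep in the last point could be sketched roughly as below, again as
a Python simulation with made-up names (AtomCell, unload_module, and a
dictionary standing in for the set of code spaces). One assumption is
made loudly here: only hints that point into the removed module are
zeroed. Zeroing every hint would also be safe and simpler, at the cost of
redoing more first-time lookups.

```python
class AtomCell:
    """Minimal stand-in for a per-module atom cell with a cached hint."""

    def __init__(self, name):
        self.name = name
        self.remote_hint = None  # "zero" until a remote match is cached


def unload_module(code_spaces, dead):
    """Remove module `dead` and zero every hint that pointed into it.

    code_spaces maps a module name to that module's private atom table
    (atom name -> AtomCell). Returns the number of hints cleared.
    """
    removed = code_spaces.pop(dead)
    dead_cells = set(removed.values())  # cells compare and hash by identity
    cleared = 0
    for table in code_spaces.values():
        for cell in table.values():
            if cell.remote_hint in dead_cells:
                cell.remote_hint = None  # next comparison redoes the lookup
                cleared += 1
    return cleared
```

After the sweep no surviving cell refers to the removed code, so the
removed module's atom table can be freed along with it.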
Basically we should not garb the atom table - we should garb the code
space - and we should dynamically cache atom and function start addresses.
The idea of having two versions of code is silly anyway - we should have
N versions and garb away old versions. Atoms should not be global, but local to
individual modules and cacheable hint pointers should be used to optimise
atom comparison and function start address resolution.
Code should be first class - but probably represented by special frozen heap
objects, since it is likely to hang around for a long time and moving it
would be expensive: we would have to invalidate the cached heap references.
Cheers
/Joe