Concatenating atoms

Joe Armstrong (AL/EAB) joe.armstrong@REDACTED
Tue Feb 1 15:23:27 CET 2005


 
> 
> > > Thomas Lindgren wrote
> > > [atom gc!]
> >
> >   Nja - Ummm - we're garbing the wrong thing - we
> > should be garbing
> > the code space and not the atom space, atoms should
> > be local to modules
> > and not global at all. 
> ...
> > There should not be a global
> > atom table in the
> > first place - it violates the principle of
> > isolation.
> 
> Violates it in what way?

Things are isolated if they do not share anything.

Modules (after loading) share pointers into hash tables,
which is *evil*.

> 
> The serious issue about atoms as implemented today, in
> my mind, is that (a) you may run out of them, (b) once
> you have created one, you can never get rid of it.
> (Except by restarting the VM.)
> 
> > [atoms should be put in per-module atom tables]
> 
> How would dynamically created atoms be handled?
> 

  Put them on the local heap and compare each time :-)

  I'm not even sure that having an atom table etc. saves much time.
Suppose we did atom comparisons the hard way, testing *each* time
that the data in the atoms was identical. How long would this take?

  On a 32 bit machine this might involve comparing 2-3 words for
equality.

  +---------+             +--------+
  | tag  ---|-----------> | Length |
  +---------+             +--------+
                          | chars  |
                          +--------+
                          | ...    |
                          +--------+

  An 8 character atom would involve comparing 3 words for equality etc.
Since you get good locality of reference I suspect this is a pretty
efficient operation.
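
  A rough sketch in C of what "the hard way" might look like, assuming
32-bit words, a length word followed by the characters, and zero padding
up to a whole word. The names heap_atom, make_atom and atom_equal are
made up for illustration and are not taken from the emulator.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical heap atom: one length word, then the characters,
   zero-padded up to a whole number of 32-bit words. */
typedef struct {
    uint32_t length;    /* number of characters */
    uint32_t words[];   /* packed characters, zero-padded */
} heap_atom;

/* Number of 32-bit words needed to hold `length` characters. */
static size_t atom_words(uint32_t length) {
    return (length + 3) / 4;
}

/* Allocate an atom for `name`; calloc supplies the zero padding. */
heap_atom *make_atom(const char *name) {
    uint32_t len = (uint32_t)strlen(name);
    size_t n = atom_words(len);
    heap_atom *a = calloc(1, sizeof(heap_atom) + n * sizeof(uint32_t));
    if (a == NULL) return NULL;
    a->length = len;
    memcpy(a->words, name, len);
    return a;
}

/* Compare the length word first, then the character words one by one. */
int atom_equal(const heap_atom *a, const heap_atom *b) {
    if (a == b) return 1;                     /* same heap object */
    if (a->length != b->length) return 0;
    size_t n = atom_words(a->length);
    for (size_t i = 0; i < n; i++) {
        if (a->words[i] != b->words[i]) return 0;
    }
    return 1;
}
```

  With this layout an 8 character atom costs one length word plus two
character words, i.e. the 3 word comparisons mentioned above.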

  Various optimisations are possible (for example using a crc32 checksum
for long atoms etc)
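
  One way the checksum idea could go, again only as a sketch: store a
CRC-32 next to each long atom and compare the checksums before touching
the characters. The long_atom struct and the function names are
assumptions for illustration. Equal checksums do not prove equality, so
the characters are still compared in that case.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Plain bitwise CRC-32 (reflected polynomial 0xEDB88320). */
uint32_t crc32_bytes(const unsigned char *data, size_t n) {
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < n; i++) {
        crc ^= data[i];
        for (int k = 0; k < 8; k++) {
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

/* Hypothetical long atom: checksum computed once, when the atom is made. */
typedef struct {
    uint32_t    crc;
    size_t      length;
    const char *chars;   /* not owned by the atom in this sketch */
} long_atom;

long_atom make_long_atom(const char *name) {
    long_atom a;
    a.length = strlen(name);
    a.chars  = name;
    a.crc    = crc32_bytes((const unsigned char *)name, a.length);
    return a;
}

/* Cheap rejection on checksum and length; full compare only if both match. */
int long_atom_equal(const long_atom *a, const long_atom *b) {
    if (a->crc != b->crc || a->length != b->length) return 0;
    return memcmp(a->chars, b->chars, a->length) == 0;
}
```

  Most unequal atoms are then rejected after a single word comparison.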

  On a 64 bit machine things become interesting
  
  Assume (short) characters are restricted to a-zA-Z0-9_- (64 possible characters) 

  Each (short) character takes 6 bits so in 64 bits we could pack
say 4 tag bits + 10 characters.

  Now we could make two types of atoms (short and long); short atoms
must be <= 10 short characters.
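
  A sketch of the packing in C. The text above does not say how names
shorter than 10 characters are padded, so this sketch assumes (purely
for illustration) that the 4 tag bits hold the length, 1..10, so that
'a' and 'aa' pack to different words. The function name and the
character ordering are likewise assumptions.

```c
#include <stdint.h>
#include <string.h>

/* The 64 "short" characters: a-z A-Z 0-9 _ -  (6 bits each). */
static const char ALPHABET[] =
    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_-";

#define MAX_SHORT 10   /* 4 tag bits + 10 * 6 character bits = 64 bits */

/* Pack `name` into a single 64-bit word.
   Returns 1 and stores the word in *out if the name is a short atom,
   0 if it is empty, too long, or uses a character outside the alphabet
   (such names would have to be long atoms).
   Assumption: the low 4 bits (the tag) carry the length. */
int pack_short_atom(const char *name, uint64_t *out) {
    size_t len = strlen(name);
    if (len == 0 || len > MAX_SHORT) return 0;

    uint64_t word = (uint64_t)len;            /* tag bits */
    for (size_t i = 0; i < len; i++) {
        const char *p = strchr(ALPHABET, name[i]);
        if (p == NULL) return 0;              /* not a short character */
        word |= (uint64_t)(p - ALPHABET) << (4 + 6 * i);
    }
    *out = word;
    return 1;
}
```

  Two short atoms are then equal exactly when their 64-bit words are
equal, so the comparison is a single machine instruction and no table
lookup is involved.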

  This would be nice because longSillyAtomNamesThatAreTotallyUnreadable would
be less attractive :-)  
  

/Joe
 

> > [code should be garbage collected]
> 
> I think I agree with this, regardless of the treatment
> of atoms, but I'd probably want a mechanism to manage
> the various loaded module versions too.(*)
> 
> In particular, the great advantage of the current code
> model ("as you know, Bob" :-) is never getting space
> leaks due to obsolete module versions hanging around.
> A requirement should be that new schemes not be too
> vulnerable to that either.
> 
> By the way, that multithreaded Erlang had code GC,
> didn't it? Were there any lessons on how well it
> worked?
> 
> Best,
> Thomas
> 
> (*) Actually, I want a lot of things when it comes to
> modules :-)
> 



More information about the erlang-questions mailing list