Nothingness

Thu Oct 25 10:39:10 CEST 2001

On Thu, 25 Oct 2001, Bengt Kleberg wrote:

Bengt> > Date: Wed, 24 Oct 2001 16:35:14 -0700
Bengt> > From: Erik Pearson <erik@REDACTED>
Bengt> ...deleted
Bengt> 
Bengt> > However, this seems at first blush inefficient and cumbersome. Is this 
Bengt> > true? Perhaps atoms are stored very efficiently? (Of course, the tag 
Bengt> > atoms could be made shorter as well)
Bengt> 
Bengt> Atoms are stored efficiently.
Bengt> Atoms use the same space independent of their length.

This holds within an Erlang node and to some extent when atoms are
sent between nodes, due to the built-in atom cache used for
distributed Erlang.

In the external format for atoms, the entire "name" of the atom needs
to be stored as it does not suffice to store only the atom number.
The internal atom numbers are only valid within one Erlang node and
they may be re-calculated after node restart.

The external format is used by Erlang/OTP for file storage (dets,
disk_log, disk resident Mnesia tables etc.) but is also convenient for
other purposes when an arbitrary Erlang term needs to be externalized.
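
As a rough sketch of such externalization, a term can be written to a
file and read back with nothing more than term_to_binary/1 and
binary_to_term/1. The module name ext_roundtrip and the functions
save/2 and load/1 are made up for this illustration; they are not part
of OTP.

```erlang
-module(ext_roundtrip).
-export([save/2, load/1]).

%% Hypothetical helper: write any Erlang term to a file in the
%% external term format. Atoms inside the term are stored by name.
save(Path, Term) ->
    file:write_file(Path, term_to_binary(Term)).

%% Hypothetical helper: read the file back and rebuild the term.
%% Crashes (badmatch) if the file cannot be read.
load(Path) ->
    {ok, Bin} = file:read_file(Path),
    binary_to_term(Bin).
```

Since the atom names travel with the data, a file written by
ext_roundtrip:save("term.bin", {undefined, [1,2,3]}) can be loaded by
another node, or by the same node after a restart.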

Here you can see the external size of 'undefined' vs. 'u':

  % erl
  Erlang (BEAM) emulator version 5.1 [threads:0]

  Eshell V5.1  (abort with ^G)
  1> term_to_binary(undefined).
  <<131,100,0,9,117,110,100,101,102,105,110,101,100>>
  2> term_to_binary(u).        
  <<131,100,0,1,117>>
  3> size(term_to_binary(undefined)).
  13
  4> size(term_to_binary(u)).
  5
  5> size(term_to_binary([])).
  2
  6> 
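
The same measurement can be wrapped in a small module. This is only a
sketch; the module name atom_ext and its functions are invented for
the example. In the session above, the bytes around the name are 131
(format version), 100 (atom tag) and a two-byte length, so the
overhead is 4 bytes for both atoms. Other releases may encode atoms
differently, so do not rely on that exact constant.

```erlang
-module(atom_ext).
-export([ext_size/1, name_overhead/1]).

%% Number of bytes a term occupies in the external format.
ext_size(Term) ->
    byte_size(term_to_binary(Term)).

%% Bytes of framing around an atom's name: total external size minus
%% the number of characters in the name (assumes one byte per
%% character, which holds for plain ASCII names).
name_overhead(Atom) when is_atom(Atom) ->
    ext_size(Atom) - length(atom_to_list(Atom)).
```

For two short ASCII atoms the overhead is the same, so the external
size differs only by the difference in name length: 8 bytes between
'undefined' and 'u'.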

If you get performance problems, there are most likely other things in
your program that cause the trouble, not the length of atom names.
The handling of atoms in Erlang is very efficient and you should stick
with mnemonic atom names.

/Håkan

---
Håkan Mattsson
Ericsson
Computer Science Laboratory
http://www.ericsson.com/cslab/~hakan/