[erlang-questions] error_logger and the perils of asynchronicity

Ulf Wiger ulf.wiger@REDACTED
Thu May 14 10:38:08 CEST 2009

I recently had reason to stare at a crash dump from a
system that had run out of memory.

Mon May 11 20:31:02 2009
Slogan: eheap_alloc: Cannot allocate 191315400 bytes of memory (of type 
System version: Erlang (BEAM) emulator version 5.6.5 [source] [smp:8] [async-threads:10] [kernel-poll:true]

In other words, we ran out of memory as the garbage collector
tried to find a place to allocate 191 MB of heap space.

Looking further down in the crash dump (searching for "Garbing"),
we find the "guilty" process:

State: Garbing
Name: error_logger
Spawned as: proc_lib:init_p/5
Last scheduled in for: io_lib_pretty:print_length_list1/3
Message queue length: 219
Stack+heap: 38263080
OldHeap: 47828850
Heap unused: 13858
OldHeap unused: 47828850
Program counter: 0xb13861a8 (io_lib_pretty:print_length_list1/3 + 4)
CP: 0x00000000 (invalid)
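
The search for "Garbing" can also be scripted. Below is a minimal
sketch in Python, assuming (as I understand the dump format) that each
process section starts with a line of the form "=proc:<pid>" followed
by "Key: value" lines like the ones quoted above. The function name
garbing_processes, the sample pid <0.5.0> and the second process in
SAMPLE are made up for illustration.

```python
import re

def garbing_processes(dump_text):
    """Return one dict of fields per process section whose State is Garbing."""
    found = []
    # Assumption: sections begin with "=tag:id", e.g. "=proc:<0.5.0>".
    for section in re.split(r"^=", dump_text, flags=re.MULTILINE):
        if not section.startswith("proc:"):
            continue
        header, _, body = section.partition("\n")
        fields = {"pid": header[len("proc:"):].strip()}
        for line in body.splitlines():
            key, sep, value = line.partition(": ")
            if sep:
                fields[key] = value.strip()
        if fields.get("State") == "Garbing":
            found.append(fields)
    return found

# Hypothetical excerpt; only the error_logger values come from the dump above.
SAMPLE = """\
=proc:<0.5.0>
State: Garbing
Name: error_logger
Message queue length: 219
Stack+heap: 38263080
OldHeap: 47828850
=proc:<0.6.0>
State: Waiting
Name: some_other_process
"""

for proc in garbing_processes(SAMPLE):
    print(proc["pid"], proc["Name"], proc["Message queue length"])
```

On a real dump one would read the file contents into dump_text first;
several processes may turn up, since more than one can be collecting
garbage when the emulator dies.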

Surprisingly, it is the error_logger in this case,
which sports a hefty 38 million words of new heap and
48 million words of old heap. It also has 219 messages
in the message queue. As far as I understand, we cannot
know what these messages are, since they are not present
in the dump (and, being unprocessed in the message queue,
certainly haven't been written to disk).
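
A note on units: the Stack+heap and OldHeap figures in a crash dump
are, as far as I know, counted in machine words rather than bytes.
The quick check below assumes a 32-bit emulator with 4-byte words
(the 32-bit program counter suggests as much); WORD_SIZE and
words_to_bytes are names made up for this sketch.

```python
WORD_SIZE = 4  # bytes per word; assumption: 32-bit emulator

def words_to_bytes(words, word_size=WORD_SIZE):
    """Convert a heap size in words to bytes."""
    return words * word_size

stack_heap = words_to_bytes(38263080)  # Stack+heap from the dump
old_heap = words_to_bytes(47828850)    # OldHeap from the dump

print(stack_heap)  # 153052320
print(old_heap)    # 191315400
```

The second number is exactly the 191315400 bytes in the Slogan line,
so the allocation that failed looks like the one for the old heap.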

Personally, I feel that this is a good illustration
of how things can go wrong when one relies on asynchronous
sends as a way to avoid delaying the working processes too
much. Sometimes they really do need to be held back,
and then we don't have any means to do so.

I'm doubtful whether it is really a good idea to just
cast messages to the error_logger. If we do, perhaps
the error_logger should have a strategy for throwing
stuff away if it gets severely backed up. After all,
if the entire system is killed, the information is lost
anyway.
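
To make the "throw stuff away" idea concrete, here is a rough sketch
of one possible policy: keep a bounded backlog and discard the oldest
entries once it is full, counting what was dropped. It is written in
Python only to show the policy. It is not how the error_logger
behaves, and SheddingLog, post and max_pending are hypothetical names.

```python
from collections import deque

class SheddingLog:
    """Bounded backlog that discards the oldest entries when full."""

    def __init__(self, max_pending):
        # A deque with maxlen evicts from the left when appended to while full.
        self.pending = deque(maxlen=max_pending)
        self.dropped = 0

    def post(self, message):
        if len(self.pending) == self.pending.maxlen:
            self.dropped += 1  # the append below evicts the oldest entry
        self.pending.append(message)

log = SheddingLog(max_pending=3)
for n in range(5):
    log.post(f"report {n}")

print(list(log.pending), log.dropped)
# ['report 2', 'report 3', 'report 4'] 2
```

Dropping the oldest entries keeps the most recent reports; dropping
the newest instead would preserve the first sign of trouble. Either
way memory stays bounded, which is the point.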

Ulf W
Ulf Wiger
CTO, Erlang Training & Consulting Ltd
