[erlang-questions] Why Beam.smp crashes when memory is over?
Jayson Vantuyl
kagato@REDACTED
Sun Nov 8 22:26:43 CET 2009
Erlang needs to allocate memory in any number of situations. For
example, assume that Erlang tried to tell your code that it had run
out of memory. Should it generate a message? Should it call a
function? Should it raise an exception? Guess what all of these have
in common? They allocate memory (which isn't there).
You can try to work around this. You can keep some memory in reserve
just for this. However, there's still no good answer for where the
error should be delivered. There is not a very good chance that the
allocation will fail in the process that is holding all of the memory.
If you take the Linux OOM approach, you would have to scan all of the
processes, weigh them, mix in some randomness, and message the one you
picked. There's no memory left for the VM to think much about the
problem. Even if you killed that process, that would just trigger its
supervisor to restart it, even though we may not have actually stopped
the memory leak.
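
To make the "scan and weigh" step concrete, here is a rough sketch of
what it might look like if you wrote it in Erlang itself. The module
and function names (oom_sketch, heaviest/0) are made up for
illustration, and this is not how the VM does anything internally. It
only uses erlang:processes/0 and erlang:process_info/2.

```erlang
-module(oom_sketch).
-export([heaviest/0]).

%% Hypothetical helper: return {Pid, Bytes} for the process that
%% currently reports the largest memory footprint, or 'none'.
heaviest() ->
    lists:foldl(fun pick/2, none, sizes()).

%% Build a list of {Pid, Bytes}. process_info/2 returns 'undefined'
%% for a process that has already exited; the pattern in the
%% generator silently skips those.
sizes() ->
    [{P, M} || P <- erlang:processes(),
               {memory, M} <- [erlang:process_info(P, memory)]].

%% Keep whichever candidate reports more memory.
pick(X, none) -> X;
pick({_, M} = X, {_, Best}) when M > Best -> X;
pick(_, Acc) -> Acc.
```

Note that even this much builds a list with one entry per process,
so the scan itself has to allocate, which is exactly the problem once
memory is already exhausted.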
Worse, this means that an "out of memory error" can happen anywhere
and must be handled everywhere, even the supervisors. Patching the
supervisors to reliably handle this would be insane. Suddenly,
reliability under load becomes impossible to guarantee.
Even if you emulate Linux and provide an OOM-killer (i.e. kill
processes based on randomness + heuristics to detect runaway
processes), you introduce tons of random behavior into the VM, when a
VM restart would be recognizable, loggable, and generally easier to
debug.
Exposing those errors creates an ugly situation. This extra error
handling would cause an explosion of corner cases, decreases in
reliability, and volumes of code (i.e. where bugs live). Inside of
Erlang, the philosophy is to use supervisors and to write daemons that
are able to recover from a restart. Heartbeat gives the same behavior
for the entire VM. It's a philosophical design choice to let critical
faults surface and recover by restarting rather than mask them. That's
really better than trying to handle an out-of-memory error in place.
It seems obvious that there should be a better way to handle OOM,
but it is all devilishly difficult to do in any meaningfully
portable (or useful) way.
On Nov 8, 2009, at 1:02 PM, Max Lapshin wrote:
> On Sun, Nov 8, 2009 at 11:57 PM, Jayson Vantuyl <kagato@REDACTED>
> wrote:
>> From within Erlang, I don't believe so.
>
> And what are the problems? OS never crashes when memory is over, OOM
> killer does the job well.
> Why should die Erlang VM?
--
Jayson Vantuyl
kagato@REDACTED