[erlang-questions] Why Beam.smp crashes when memory is over?

Joe Armstrong erlang@REDACTED
Tue Nov 10 13:45:17 CET 2009


This is a very interesting problem. If processes have quotas, then how
could you set the quota value?
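
(To make the question concrete: suppose spawn_opt took a quota
option - the fragment below invents one, and TestPid and Fun are
assumed to be bound elsewhere. How would you ever pick the number?)

    %% One way to pick a quota: measure a representative process in
    %% testing, then allow generous headroom when spawning for real.
    %% process_info/2 is real; the max_heap_size option is a
    %% hypothetical spawn option, invented here for illustration.
    {total_heap_size, Words} = erlang:process_info(TestPid, total_heap_size),
    Quota = 10 * Words,        %% is 10x headroom enough? 100x?
    Pid = spawn_opt(Fun, [{max_heap_size, Quota}]).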

A perfectly correct process might have a very deep stack just once in
its life and otherwise be fine. Whether to crash this process or not
would depend upon what the other processes in the system happened to
be doing at the time. This would be very unfortunate - it's like your
program being hit by a cosmic ray - nasty. It creates a random,
non-deterministic coupling between things that are supposed to be
independent.

A possibility that just occurred to me: suspend processes that appear
to be running wild until the overall memory situation looks good again.

Imagine two scheduler queues: one for well-behaved processes, and one
for processes whose stacks and heaps are growing too fast. If memory
is no problem we run processes from both queues. If memory is tight we
run processes in the "problem" queue less often and with frequent
garbs (garbage collections).

Killing a process with a large stack and heap, just because there
happens to be a temporary memory problem, seems horrible, especially
since the problem might go away if we wait a few milliseconds.

Suspending a memory-hungry process for a while, until memory is
available, seems less objectionable. Perhaps it could be swapped out
to disk and pulled in a lot later. Killing things at random in the
hope that it might help sounds like a really bad idea. Process
migration could also solve this - move the process to a machine that
has more memory.
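
(A rough sketch of such a memory guard, using the existing - if
debug-oriented - suspend_process/1 and resume_process/1 BIFs; the
thresholds and polling interval are invented:)

    -module(mem_guard).
    -export([loop/1]).

    -define(HIGH, 800 * 1024 * 1024).    %% start suspending above this
    -define(LOW,  600 * 1024 * 1024).    %% resume again below this

    %% Suspended is the list of processes we have parked.
    loop(Suspended) ->
        Total = erlang:memory(total),
        Next =
            if Total > ?HIGH, Suspended =:= [] ->
                    P = hungriest(),
                    erlang:suspend_process(P),
                    [P];
               Total < ?LOW, Suspended =/= [] ->
                    lists:foreach(fun erlang:resume_process/1, Suspended),
                    [];
               true ->
                    Suspended
            end,
        timer:sleep(100),
        loop(Next).

    %% The process currently using the most memory (other than us).
    hungriest() ->
        {P, _} = lists:last(lists:keysort(2,
                     [{Q, M} || Q <- processes(), Q =/= self(),
                                {memory, M} <- [erlang:process_info(Q, memory)]])),
        P.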

Suspending things seems OK - you might even suspend an errant process
forever and reclaim the memory - but not kill it. Some other process
could detect that the process is not responding and kill it; thus all
the semantics of the application would be obeyed (processes are
allowed to be unresponsive, that's fine) and the semantics of the
error recovery would say what to do in this case.
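
(The detecting process might look something like this - the ping
protocol here is invented; exit/2 is not:)

    %% Ask Pid to prove it is alive; if it stays silent past the
    %% deadline, the error-recovery policy - not the VM - kills it.
    check(Pid, Timeout) ->
        Ref = make_ref(),
        Pid ! {ping, self(), Ref},
        receive
            {pong, Ref} -> alive
        after Timeout ->
            exit(Pid, kill),
            killed
        end.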

Just killing processes when they have done nothing wrong is not a good idea.

/Joe



On Tue, Nov 10, 2009 at 1:10 PM, Ulf Wiger
<ulf.wiger@REDACTED> wrote:
> Richard O'Keefe wrote:
>>>
>>> One way would be to let the user set a memory quota on a process with
>>> options at spawn time. When the process reaches it quota it can be
>>> automatically killed or the user can
>>> be notified in some way and take actions.
>>
>> One of the reasons this hasn't been done is, I presume, the fact that
>> it is quite difficult for a programmer to determine what the memory
>> quota should be.  It depends on
>>  ...
>
> I implemented resource limits in erlhive - at the Erlang level rather
> than in the VM. The purpose was to be able to run foreign code safely
> in a hosted environment. Eliminating the possibility of doing damage
> through traditional side-effects was relatively easy with a code
> transform, but two ways of staging a DoS attack would be to gobble
> RAM or CPU capacity. I approached this by inserting calls to a check
> function that sampled heap size, and started a "watchdog" process that
> would unceremoniously kill the program after a certain time.
>
> In short, I can see a need for such limits, and would like to include
> a reduction ceiling. The limits could be set after careful testing
> and high enough that they protect against runaway processes. A reduction
> limit could be checked at the end of each slice, perhaps.
>
> In my experience, per-process memory usage is fairly predictable in
> Erlang. Does anyone have a different experience?
>
> BR,
> Ulf W
> --
> Ulf Wiger
> CTO, Erlang Training & Consulting Ltd
> http://www.erlang-consulting.com
>
> ________________________________________________________________
> erlang-questions mailing list. See http://www.erlang.org/faq.html
> erlang-questions (at) erlang.org
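
(A minimal sketch of the kind of check function and watchdog described
above - the names and limits are invented, and erlhive's actual code
will differ:)

    %% Inserted by the code transform at safe points: sample our own
    %% heap and reduction counts and give up if over the ceiling.
    check(MaxHeap, MaxReds) ->
        {heap_size, H}  = erlang:process_info(self(), heap_size),
        {reductions, R} = erlang:process_info(self(), reductions),
        if H > MaxHeap; R > MaxReds -> exit(resource_limit);
           true -> ok
        end.

    %% The watchdog: kill the program unceremoniously after its
    %% allotted time, whatever it is doing.
    watchdog(Pid, TimeLimit) ->
        spawn(fun() ->
                  receive after TimeLimit -> exit(Pid, kill) end
              end).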

