[erlang-questions] Fail fast

Fri Dec 23 03:03:20 CET 2011

If there was a proper supervision hierarchy all the way up to the "root" of
the application, why would this happen? Wouldn't the supervisors just kill
off whatever process ends up not being able to allocate memory, and thus
clean up (perhaps kicking off users at the same time)? If it fails far
enough up, wouldn't it basically reset the erl environment to "scratch"?
Or would that be expecting too much from the supervision hierarchy?

Sincerely,

jw

On Tue, Dec 20, 2011 at 6:23 PM, Chandru <chandrashekhar.mullaparthi@REDACTED> wrote:

> Hello everyone,
>
> I've just had a failure in one of my live services because an Erlang node
> ran out of memory (caused by a traffic spike). Restart mechanisms exist to
> restart the node, but the node took a long time to die because it was
> writing a large erl_crash.dump file, and then there was a 7GB core dump.
>
> Is there a quicker way to fail? I'm thinking of disabling core dumps
> entirely on the box. What else can I do? A configuration option on the node
> to only produce a summary erl_crash.dump would be nice. The most useful
> things for me in a crash dump usually are the slogan at the top, and the
> message queue lengths of each process. In this particular case, the slogan
> would've told me all that I needed to know.
>
> Chandru