[erlang-questions] all nodes in cluster crashing with eheap_alloc in the same time
Lukas Larsson
lukas@REDACTED
Mon Sep 26 10:55:35 CEST 2016
Hello,
On Wed, Sep 21, 2016 at 7:50 PM, Caragea Silviu <silviu.cpp@REDACTED>
wrote:
>
> The only question I have now is:
>
> How can I include more information in the logs before the process dies,
> such as the number of messages in its queue?
>
> We also tried to set up a monitor that triggers well below the limit at
> which the process gets killed:
>
> Options = [{long_gc, 10000}, {large_heap, 1000000}, busy_port,
>            busy_dist_port],
> erlang:system_monitor(self(), Options),
>
> handle_info({monitor, Pid, Type, Details}, State) ->
>     log_system_event({Type, Pid, Details}),
>     {noreply, State};
>
> log_system_event({large_heap, GcPid, Info}) ->
>     LogFun = fun() ->
>         case recon:info(GcPid, messages) of
>             {messages, Messages} ->
>                 ?WARNING_MSG("Large heap (~p): ~p~nProcess info: ~p~n"
>                              "Process state size (words in the heap): ~p~n"
>                              "Message queue(first 10):~p~n",
>                              [GcPid, Info, recon:info(GcPid),
>                               erts_debug:size(recon:get_state(GcPid)),
>                               Messages]);
>             undefined ->
>                 ?WARNING_MSG("Large heap (~p): ~p~nProcess info is not available",
>                              [GcPid, Info])
>         end
>     end,
>     spawn(LogFun);
>
> Unfortunately, the processes that have this issue live for less than
> 4 seconds, so this event is never triggered in time.
>
> Any help is appreciated!
>
You could try to use max_heap_size with #{ kill => false } and then install
a specialized error_logger handler that listens specifically for that type
of event and retrieves the information you want before killing the process.
Depending on how fast you need the runaway process to be killed, this may
be acceptable to you.
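
A minimal sketch of that approach, not a definitive implementation. The
module name max_heap_h and all of its helper functions are hypothetical.
It also assumes that the emulator report reaches error_logger handlers as
{error, GL, {emulator, Format, Args}}, with a format string containing
"maximum heap size reached" and the offending pid among Args; verify that
shape on the OTP release you run. Only cheap process_info items are read,
so a huge message queue is not copied into the error_logger process.

```erlang
-module(max_heap_h).
-behaviour(gen_event).

-export([install/0, limit_self/1, is_max_heap_report/1, first_pid/1]).
-export([init/1, handle_event/2, handle_call/2, handle_info/2,
         terminate/2, code_change/3]).

%% Attach this handler to the error_logger event manager.
install() ->
    error_logger:add_report_handler(?MODULE, []).

%% Call inside the process to be limited; Words is the heap limit in words.
%% With kill => false the process keeps running after the report is sent.
limit_self(Words) ->
    process_flag(max_heap_size,
                 #{size => Words, kill => false, error_logger => true}).

init(_Args) ->
    {ok, no_state}.

%% ASSUMPTION: emulator reports arrive in this shape (check your release).
handle_event({error, _GL, {emulator, Format, Args}}, State) ->
    case is_max_heap_report(Format) of
        true  -> inspect_and_kill(first_pid(Args));
        false -> ok
    end,
    {ok, State};
handle_event(_Other, State) ->
    {ok, State}.

handle_call(_Request, State) -> {ok, ok, State}.
handle_info(_Info, State) -> {ok, State}.
terminate(_Reason, _State) -> ok.
code_change(_OldVsn, State, _Extra) -> {ok, State}.

%% True if the format string mentions the max heap size context.
is_max_heap_report(Format) ->
    try re:run(Format, "maximum heap size reached", [{capture, none}]) of
        match   -> true;
        nomatch -> false
    catch
        error:badarg -> false
    end.

%% Return the first pid found in the argument list, or undefined.
first_pid([Pid | _]) when is_pid(Pid) -> Pid;
first_pid([_ | Rest]) -> first_pid(Rest);
first_pid(_) -> undefined.

%% Log a few cheap facts about the process, then kill it ourselves.
inspect_and_kill(Pid) when is_pid(Pid), node(Pid) =:= node() ->
    Items = [registered_name, message_queue_len, total_heap_size,
             current_function],
    Info = erlang:process_info(Pid, Items),
    error_logger:warning_msg("Max heap size reached in ~p: ~p~n",
                             [Pid, Info]),
    exit(Pid, kill),
    ok;
inspect_and_kill(_) ->
    ok.
```

Call max_heap_h:install() once at startup and max_heap_h:limit_self(Words)
in each process that should be limited. Until the handler gets to run, the
process keeps growing, which is the trade-off mentioned above.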
Another tip: you may want to configure processes that tend to build large
message queues to keep their message queue data off_heap. It will make the
GC a lot happier, but it will also make max_heap_size not include the
message queue size when doing its analysis.
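
As a small illustration of the off_heap setting, here is a sketch that
spawns a process with its message queue data kept off the heap and reads
the setting back. The module name mq_off_heap and its functions are
hypothetical; it assumes OTP 19 or later, where the message_queue_data
option exists.

```erlang
-module(mq_off_heap).
-export([start/0, queue_mode/1]).

%% Spawn a worker whose queued messages are stored outside its heap.
start() ->
    spawn_opt(fun loop/0, [{message_queue_data, off_heap}]).

%% Read back the message_queue_data setting of a live local process.
queue_mode(Pid) ->
    {message_queue_data, Mode} =
        erlang:process_info(Pid, message_queue_data),
    Mode.

%% Trivial receive loop; stops when it gets the atom stop.
loop() ->
    receive
        stop -> ok;
        _Msg -> loop()
    end.
```

An already running process can switch itself with
process_flag(message_queue_data, off_heap).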
Lukas