[erlang-questions] System limit bringing down rex and the VM

Musumeci, Antonio S Antonio.Musumeci@REDACTED
Wed Sep 8 18:22:31 CEST 2010


OS limits? This is the BEAM process limit, not an OS limit. Rex doesn't handle the error gracefully; in this case it was triggered by mnesia_recover. I don't see why rex shouldn't handle it more gracefully, either by returning the error to the caller or by restarting. Rex may be a base process, but IMO it's not important enough to justify shutting down the VM. I'd rather not have to implement OOM-killer-like behavior or some other watchdog process just so my VM doesn't crash when it hits an otherwise arbitrary max process limit.
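
To make the "return the error to the caller" option concrete, here is a minimal sketch, not what rpc actually does. The module name proc_headroom and both function names are made up for illustration; only erlang:system_info/1 and erlang:spawn_monitor/1 are real BIFs. It assumes the spawn failure surfaces as an error exception with reason system_limit, which is what the trace below suggests.

```erlang
-module(proc_headroom).
-export([headroom/0, safe_spawn_monitor/1]).

%% How many more processes the VM can create before it hits the
%% process limit (the limit itself is fixed at startup via "erl +P N").
headroom() ->
    erlang:system_info(process_limit) - erlang:system_info(process_count).

%% Spawn Fun with a monitor, but turn a system_limit failure into an
%% ordinary error tuple instead of letting the calling process crash.
safe_spawn_monitor(Fun) when is_function(Fun, 0) ->
    try erlang:spawn_monitor(Fun) of
        {Pid, Ref} -> {ok, Pid, Ref}
    catch
        error:system_limit -> {error, system_limit}
    end.
```

A server written this way could reply {error, system_limit} to the caller and keep running, rather than terminating and exhausting its supervisor's restart intensity.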

-----Original Message-----
From: Igor Ribeiro Sucupira [mailto:igorrs@REDACTED] 
Sent: Wednesday, September 08, 2010 11:29 AM
To: Musumeci, Antonio S (Enterprise Infrastructure)
Cc: erlang-questions@REDACTED
Subject: Re: [erlang-questions] System limit bringing down rex and the VM

I know this is obvious, but the first step should be adjusting your OS limits so that your system, under expected conditions, never fails because of them.
I have seen a couple of funny behaviours when exhausting system limits in the QA environment, so I wouldn't count on graceful failures from Erlang/Mnesia/etc.

Good luck.
Igor.

On Wed, Sep 8, 2010 at 10:35 AM, Musumeci, Antonio S <Antonio.Musumeci@REDACTED> wrote:
> Is this expected behavior? Shouldn't there be a way to handle this type of thing more gracefully?
>
> =ERROR REPORT==== 8-Sep-2010::09:28:26 ===
> ** Generic server rex terminating
> ** Last message in was {call,mnesia_lib,get_node_number,[],<0.55.0>}
> ** When Server state == {0,nil}
> ** Reason for termination ==
> ** {system_limit,[{erlang,spawn_opt,
>                          [{erlang,apply,
>                                   [#Fun<rpc.1.93176247>,[]],
>                                   [monitor]}]},
>                  {erlang,spawn_monitor,1},
>                  {rpc,handle_call_call,6},
>                  {gen_server,handle_msg,5},
>                  {proc_lib,init_p_do_apply,3}]}
>
> =CRASH REPORT==== 8-Sep-2010::09:28:26 ===
>  crasher:
>    initial call: rpc:init/1
>    pid: <0.12.0>
>    registered_name: rex
>    exception exit: {system_limit,
>                        [{erlang,spawn_opt,
>                             [{erlang,apply,
>                                  [#Fun<rpc.1.93176247>,[]],
>                                  [monitor]}]},
>                         {erlang,spawn_monitor,1},
>                         {rpc,handle_call_call,6},
>                         {gen_server,handle_msg,5},
>                         {proc_lib,init_p_do_apply,3}]}
>      in function  gen_server:terminate/6
>    ancestors: [kernel_sup,<0.10.0>]
>    messages: []
>    links: [<0.11.0>]
>    dictionary: []
>    trap_exit: true
>    status: running
>    heap_size: 377
>    stack_size: 24
>    reductions: 143
>  neighbours:
>
> =SUPERVISOR REPORT==== 8-Sep-2010::09:28:26 ===
>     Supervisor: {local,kernel_sup}
>     Context:    child_terminated
>     Reason:     {system_limit,
>                     [{erlang,spawn_opt,
>                          [{erlang,apply,
>                               [#Fun<rpc.1.93176247>,[]],
>                               [monitor]}]},
>                      {erlang,spawn_monitor,1},
>                      {rpc,handle_call_call,6},
>                      {gen_server,handle_msg,5},
>                      {proc_lib,init_p_do_apply,3}]}
>     Offender:   [{pid,<0.12.0>},
>                  {name,rex},
>                  {mfargs,{rpc,start_link,[]}},
>                  {restart_type,permanent},
>                  {shutdown,2000},
>                  {child_type,worker}]
>
>
> =SUPERVISOR REPORT==== 8-Sep-2010::09:28:26 ===
>     Supervisor: {local,kernel_sup}
>     Context:    shutdown
>     Reason:     reached_max_restart_intensity
>     Offender:   [{pid,<0.12.0>},
>                  {name,rex},
>                  {mfargs,{rpc,start_link,[]}},
>                  {restart_type,permanent},
>                  {shutdown,2000},
>                  {child_type,worker}]
>
>
> =ERROR REPORT==== 8-Sep-2010::09:28:26 ===
> Mnesia(igo@REDACTED): ** ERROR ** mnesia_recover got unexpected info: {'EXIT',<0.84.0>,shutdown}

--
"The secret of joy in work is contained in one word - excellence. To know how to do something well is to enjoy it." - Pearl S. Buck.


