[erlang-questions] System limit bringing down rex and the VM

Thu Sep 9 12:15:38 CEST 2010

On 8 September 2010 23:34, <bile@REDACTED> wrote:

>
> How many other limits cause the platform to shit the bed? I suspect
> few. There is a massive difference between the entire platform
> collapsing and RPC not working / restarting. If spawning of processes
> is so fundamental why do the core processes fit into the process
> limit? Linux as mentioned before prevents userland from wrecking things
> by safeguarding some amount of RAM for itself. Couldn't BEAM do the
> same? Why doesn't it auto shutdown if the limit is hit? Why no warnings
> from the system? Why is it triggered by rex? A minor component of the
> base process tree.
>

As Ulf pointed out, BEAM does throw an exception. It is indeed the rex
process which decides to give up. What I meant to say was that almost all
applications written in Erlang assume they are operating within the system
limits. In the case of rex, it decides to die if it can't spawn a process.
rex is only active if you are using distributed Erlang. That is a design
decision, and it is a valid one, at least for those of us who use it a lot
in real-world situations.
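
To illustrate the exception mentioned above, here is a minimal sketch of how
application code could handle hitting the process limit itself instead of
crashing. The module name spawn_guard and both function names are hypothetical
and made up for this example; this is not how rex is implemented.

```erlang
-module(spawn_guard).
-export([try_spawn/1, headroom/0]).

%% Hypothetical helper: attempt to spawn Fun and turn the system_limit
%% error, raised when the process table is full, into a return value.
try_spawn(Fun) when is_function(Fun, 0) ->
    try
        {ok, spawn(Fun)}
    catch
        error:system_limit ->
            {error, system_limit}
    end.

%% Hypothetical helper: how many more processes can currently be created.
headroom() ->
    erlang:system_info(process_limit) - erlang:system_info(process_count).
```

A caller can then match on {error, system_limit} and shed load or back off,
rather than letting the exception propagate and terminate the calling process.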

> There is no reason to take control away from the developer. Especially
> when it means the entire platform will collapse from underneath them
> for something entirely controllable.
>
>
It is a trade-off. You have complete control over what happens in the system
when you program in C. That doesn't necessarily mean it is the best choice.

>
> > The error message you see about mnesia_recover is the "effect", not
> > the "cause".
>
> The mnesia_recover rpc call is the catalyst for the failure. It's
> issuing the RPC command. The error is caused by the rpc failure.
>

You are wrong. In fact, if you look closely, mnesia was trying to make an
rpc call while it was trying to dump core, which means it was already dying
at that point. You need to dig deeper to find the real cause.

Chandru