[erlang-questions] System limit bringing down rex and the VM

Ulf Wiger ulf.wiger@REDACTED
Thu Sep 9 17:46:00 CEST 2010

On 09/09/2010 02:15 PM, Musumeci, Antonio S wrote:
> I understand your points completely... however, there is certainly a
> difference between having an Erlang process die and allowing its peers
> to handle the cleanup, and having the Erlang VM die. The VM has no
> peer in that way.

It does have a peer if you are running with redundancy, and
Erlang/OTP was primarily designed for systems where redundancy
is pretty much a given.

> Yes, Mnesia needs RPC... but so do a lot of things,
> and if the pattern to be followed is that you die and allow the peers
> to respond... that's not what is happening here. Rex dies and brings
> the world down with it. Mnesia is unable to respond to the issue. The
> Mnesia code shows that it is prepared for {badrpc,_} errors.

Yes, but 'badrpc' means "I was not able to communicate with the
other node" - not "I wasn't even able to try". Mnesia is quite
prepared to handle the case where other nodes disappear.
In this case, a system resource has been exhausted, and as Chandru
pointed out, the crash came as Mnesia was trying to create a
core dump, which means your system was going down anyway.
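
To make the distinction concrete, here is a minimal sketch of the kind
of {badrpc,_} handling a caller can do. The module name, function name
and return shapes are hypothetical, chosen only for illustration; this
is not Mnesia's actual code. It covers only the "could not reach the
other node" case, which is what 'badrpc' reports.

```erlang
-module(rpc_sketch).
-export([safe_call/4]).

%% Hypothetical wrapper around rpc:call/4. The {error, ...} shapes
%% are assumptions made for this sketch.
safe_call(Node, M, F, A) ->
    case rpc:call(Node, M, F, A) of
        {badrpc, nodedown} ->
            %% The remote node could not be reached.
            {error, nodedown};
        {badrpc, Reason} ->
            %% Any other failure reported by rpc.
            {error, {rpc_failed, Reason}};
        Result ->
            Result
    end.
```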

This is fairly typical if you exhaust a system limit. Even if you
could theoretically write code to handle it, most of the libraries
you likely want to use have not been designed that way, so something
is going to break.

If it were only rex, it would be easy enough to write your own RPC
library that behaves differently. But Mnesia is not prepared to cope
with not being able to spawn a process or create an ETS table; the
ETS table limit is another system limit that can bring down Mnesia.
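
As a rough illustration of what "designed that way" would mean, the
sketch below catches the system_limit error that spawn and ets:new
raise when their limits are reached. The module and function names
are hypothetical. Every call site in every library would need
something like this, plus a sensible fallback, which is why it is
rarely done in practice.

```erlang
-module(limit_sketch).
-export([try_spawn/1, try_new_table/1]).

%% Hypothetical helper: returns {error, system_limit} instead of
%% crashing the caller when the process limit has been reached.
try_spawn(Fun) when is_function(Fun, 0) ->
    try spawn(Fun) of
        Pid -> {ok, Pid}
    catch
        error:system_limit -> {error, system_limit}
    end.

%% Hypothetical helper: the same idea for the ETS table limit.
try_new_table(Name) when is_atom(Name) ->
    try ets:new(Name, [set]) of
        Tab -> {ok, Tab}
    catch
        error:system_limit -> {error, system_limit}
    end.
```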

You call it a completely arbitrary limit, but it is no more
arbitrary than the number of open file descriptors you may have.
You are not forced to accept the default limit, and just like with
the file descriptor limit, you probably couldn't for any system of
significant size. Try setting the process limit to a suitably high number.
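
For example, the process limit is set with the +P flag when the
emulator is started, and the running node can report how close it is
to that limit. The value below and the module name are illustrative
assumptions, not recommendations.

```erlang
%% Start the node with a higher limit, for example:
%%     erl +P 1000000
%% (the number is illustrative; pick one that suits your system)
-module(proc_limit_sketch).
-export([headroom/0]).

%% Returns {ProcessCount, ProcessLimit, Remaining} for this node.
headroom() ->
    Limit = erlang:system_info(process_limit),
    Count = erlang:system_info(process_count),
    {Count, Limit, Limit - Count}.
```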

Ulf W