[erlang-questions] VM silently exits

Serge Aleynikov serge@REDACTED
Thu Jul 29 05:38:04 CEST 2010


Steve,

I've seen something like this, and in our case it was memory related.
Adjusting the 'fullsweep_after' option or setting the
ERL_FULLSWEEP_AFTER environment variable
(http://www.erlang.org/doc/man/erlang.html#system_flag-2) helped.

Perhaps you should set up a separate watchdog OS process that
monitors the VM's memory consumption by reading /proc/PID/stat and
logging it every N seconds.
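
A rough sketch of such a watchdog follows. It assumes Python is
available on the target and that /proc/PID/stat has the field layout
described in proc(5), with vsize as field 23 and rss (in pages) as
field 24. The names parse_stat, sample and watch are made up for this
illustration, and the 10-second default interval is arbitrary.

```python
import os
import sys
import time


def parse_stat(text):
    """Return (vsize_bytes, rss_pages) from the contents of /proc/PID/stat."""
    # Field 2 (the command name) is parenthesised and may itself contain
    # spaces or parentheses, so only split what follows the last ')'.
    rest = text[text.rindex(")") + 1:].split()
    # rest[0] is field 3 (state), so field N lives at rest[N - 3]:
    # vsize is field 23, rss is field 24 (layout assumed from proc(5)).
    return int(rest[20]), int(rest[21])


def sample(pid):
    """Read /proc/<pid>/stat once and return (vsize_bytes, rss_bytes)."""
    with open("/proc/%d/stat" % pid) as f:
        vsize, rss_pages = parse_stat(f.read())
    return vsize, rss_pages * os.sysconf("SC_PAGE_SIZE")


def watch(pid, interval, out=sys.stdout):
    """Log memory usage of pid every `interval` seconds until it disappears."""
    while True:
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        try:
            vsize, rss = sample(pid)
        except (IOError, OSError):
            # The /proc entry is gone, so the process has exited.
            out.write("%s pid=%d no longer exists\n" % (stamp, pid))
            out.flush()
            return
        out.write("%s pid=%d vsize=%d rss=%d\n" % (stamp, pid, vsize, rss))
        out.flush()
        time.sleep(interval)


# Usage (hypothetical script name): python watchdog.py PID [SECONDS]
if __name__ == "__main__" and len(sys.argv) > 1:
    watch(int(sys.argv[1]),
          float(sys.argv[2]) if len(sys.argv) > 2 else 10.0)
```

Redirecting the output to a file on persistent storage should leave a
record of the last few samples before the VM disappears, which would
at least show whether memory was climbing at the time.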

Serge

On 7/28/2010 4:29 PM, Steve Vinoski wrote:
> With R13B04 running on Montavista Linux, I've seen a few cases
> recently where the Erlang VM simply exits without any log messages,
> crash dumps, or coredumps. It appears to happen only after days of
> running under load, making it hard to reproduce and investigate. It
> may be related to memory consumption, but I'm not sure. External
> programs like heart and memsup simply report "Erlang has closed,"
> which according to the source code means they each got a return value
> of 0 from read() on their connection to the VM, which in turn would
> seem to indicate that the VM side of the connection was simply closed.
> This would all seem to indicate that the VM was either killed by
> another process or that it called exit() itself. The Linux "OOM
> killer" is disabled on this system, and I don't know of any other
> process that would be killing the VM. There are no alarms in the logs
> about hitting memory high watermarks or anything like that, and we
> aren't using any options to change allocators or anything like that.
>
> Anybody ever seen anything like this?
>
> I've found a few places in the VM C source code where exit() is called
> without logging anything. Some of these are normal exits, like when
> you exit an Erlang shell, where no logging is needed. But others seem
> to be error conditions, and there should be logging for those. I think
> I'll probably have to patch my system to add logging to those cases to
> try to track down this problem -- is there still time to get a patch
> like this into R14B? If this issue is memory-related, I suppose it's
> possible that a sudden increase in memory consumption could cause the
> VM to exit between alarm checks, explaining why things like memsup
> don't seem to notice, so it would seem to be fairly critical that
> something is logged by the VM itself for such cases.
>
> thanks,
> --steve
>
> ________________________________________________________________
> erlang-questions (at) erlang.org mailing list.
> See http://www.erlang.org/faq.html
> To unsubscribe; mailto:erlang-questions-unsubscribe@REDACTED
>

