[erlang-questions] disconnected nodes

Ahmed Omar spawn.think@REDACTED
Mon Oct 22 18:13:23 CEST 2012


Hi,
We have a cluster of 20+ nodes running R14B04 on Linux Debian Squeeze.
Suddenly we started having problems where 4-5 nodes would drop out of the
cluster, and we would see this in the logs:

=ERROR REPORT==== 2012-10-18 10:49:26 ===
** Node 'x@REDACTED' not responding **
** Removing (timedout) connection **

We took crash dumps of some of the nodes and found that error_logger has a
queue of these messages:

{notify,{error,noproc,
               {emulator,"~s~n",["erts_poll_wait() failed: ebadf (9)\n"]}}}

Any hints (other than changing the kernel net_ticktime setting)?

Best Regards,
Ahmed