[erlang-questions] disconnected nodes
Ignas Vyšniauskas
baliulia@REDACTED
Sat Feb 15 12:57:56 CET 2014
Hi Ahmed,
On 10/22/2012 06:13 PM, Ahmed Omar wrote:
> Hi, We have a cluster of 20+ nodes running R14B04 on Linux Debian
> Squeeze. Suddenly we started having problems where 4-5 would drop out
> of the cluster and we would see this in the logs
>
> =ERROR REPORT==== 2012-10-18 10:49:26 ===
> ** Node 'x@REDACTED' not responding **
> ** Removing (timedout) connection **
>
> We made a crash dump of some of the nodes and found error_logger has
> a queue of these messages
>
> {notify,{error,noproc, {emulator,"~s~n",["erts_poll_wait() failed: ebadf (9)\n"]}}}
>
> Any hints? (other than changing net_kernel net_tick_time)
>
> Best Regards, Ahmed
Sorry to bump this ancient thread, but have you perhaps found a cause
for this?
We're seeing the same thing during overloads. We're running R15B03,
though, and in addition to the EBADFs we also get EINVAL failures, i.e.:
erts_poll_wait() failed: einval
Have you maybe figured out anything specific causing the EBADFs?
Thanks,
Ignas