[erlang-questions] ei_connect fails sometimes with EIO

Michael Santos michael.santos@REDACTED
Fri Nov 25 20:44:04 CET 2011


On Fri, Nov 25, 2011 at 04:43:23PM -0200, Andre Nathan wrote:
> Hello
> 
> I have a C node doing RPC on my Erlang cluster. From time to time I get
> Input/Output errors from ei_connect.
> 
> I've found this thread which reports something similar, but apparently
> no solution was found:
> 
>   http://erlang.org/pipermail/erlang-questions/2007-July/027711.html
> 
> Since EIO is a kind of "generic error" in erl_interface, I was wondering
> if there is anything else I can do to try to find what's happening.

If you can make the error happen predictably, you could try running
epmd in debug mode (epmd -d -d -d). Or use tcpdump to dump the packets
to epmd (port 4369) and analyze them later.
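
A rough sketch of the tcpdump approach, wrapped in functions so nothing
runs until you call it. The function names and the epmd.pcap file name
are made up for illustration. It assumes epmd is on its default port
(4369) and a Linux host, where "-i any" captures on all interfaces;
capturing usually needs root.

```shell
#!/bin/sh
# Assumption: epmd listens on its default port, 4369.
EPMD_PORT=${EPMD_PORT:-4369}

# Build the tcpdump filter expression for epmd traffic.
epmd_filter() {
    printf 'tcp port %s' "$EPMD_PORT"
}

# Write epmd traffic to a capture file (default name is hypothetical).
# "-i any" is Linux-specific; name a real interface elsewhere.
capture_epmd() {
    tcpdump -i any -w "${1:-epmd.pcap}" "$(epmd_filter)"
}

# Print a saved capture with hex/ASCII payloads, without name resolution.
show_capture() {
    tcpdump -n -X -r "${1:-epmd.pcap}"
}
```

Start capture_epmd, reproduce the failure, stop it with Ctrl-C, then
look at the result with show_capture. For the debug-mode route, any
epmd already running has to be stopped first, since the new one needs
the same port.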

> One thing that I thought could be the cause of the problem is that I'm
> running the Erlang cluster with inet_dist_listen_min =
> inet_dist_listen_max = 9100; while I've had no problems with the Erlang
> cluster itself, I figured that maybe I was limiting its ability to
> handle concurrent connections by using a single port. However, changing
> inet_dist_listen_max to 9110 didn't solve the problem, so I guess it's
> something else.
> 
> Does anyone have any idea about this?
> 
> Thanks,
> Andre

I had a similar problem with erl_call when node names clashed. I was
running erl_call from Nagios to do some service checks, so several
erl_call nodes might be running at the same time. Using "unique" names
fixed it for me:

    NODEID=$(($$ % 32))
    RES=$($ERL_CALL -h n-${NODEID} -c $COOKIE -sname $NODE -a "$MODULE q [$ARG]" 2>&1)

If the modulus in NODEID was too low, I'd see EIO errors. Originally,
I just used the process ID itself for the name, but the Erlang node
would crash, I think because the atom table overflowed.
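
To make the trade-off concrete, here is a minimal sketch of the naming
scheme as a helper function. node_name and POOL_SIZE are hypothetical
names, not part of erl_call. A larger pool lowers the chance that two
concurrent checks pick the same name, while a fixed pool keeps the
number of distinct names bounded. Node names are atoms on the Erlang
side and atoms are never garbage collected, which fits the crash I
guessed at above.

```shell
#!/bin/sh
# Hypothetical helper: map a process ID onto one of POOL_SIZE node names.
# POOL_SIZE defaults to 32, the modulus used in the snippet above.
POOL_SIZE=${POOL_SIZE:-32}

node_name() {
    # $1: process ID (defaults to this shell's PID)
    pid=${1:-$$}
    # The result is always one of n-0 .. n-(POOL_SIZE-1).
    printf 'n-%d' $(( pid % POOL_SIZE ))
}
```

The result would go where n-${NODEID} is passed to -h, for example
-h "$(node_name)". Two checks whose PIDs differ by a multiple of
POOL_SIZE still collide, so this only reduces clashes, it does not
rule them out.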


