[erlang-questions] Intermittent failures connecting C hidden nodes

jm jeffm@REDACTED
Fri Jul 6 01:59:19 CEST 2007


Andy Sloane wrote:

> So it's obviously making the TCP connection, getting through send_name
> without any errors, and then recv_status is reading a response other
> than "sok" but, unhelpfully, doesn't tell me what the response
> actually was and just bombs with EIO.  This is what led me to believe
> beam thinks there's something wrong with the sname.
> 

I'd suggest reading other people's suggestions first :-), but if you
really get stuck, try:

* Can you add a net_adm:ping/1 to the c-node and get it to periodically
poll the server so you can see how often the problem occurs? This may
help you see correlations with other events (e.g. load, open sockets,
etc.).

* Is the problem isolated to Erlang, or are other protocols affected?

* Double-check that the switches between the two machines are locked
at the correct speed, e.g. 100 Mbps full-duplex.

* Check that the switches aren't overloaded in some way. I have not had
this particular problem but have had cases where upgrading a switch did
wonders.

* Check the interface stats on the switch and computers to see if there
are any errors (CRC, frame errors, etc.).

* Check MTU. Strange, I know, but it has been known to cause problems,
mostly on WAN links though.

* Check that subnets/netmask/broadcast addresses agree.

* Get Ethereal ( http://www.ethereal.com/ ) and mirror the ports of
interest to another port on the switch. Plug in a laptop running
Ethereal and see what is actually on the network.

* Check and replace cables if you're seeing errors on the interfaces.
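
For the periodic-polling idea in the first item, here is a minimal
sketch in Python of a probe that logs one timestamped line per attempt,
so failures can be lined up against load or interface errors. It only
tests plain TCP reachability; it does not perform the Erlang
distribution handshake, so it complements a net_adm:ping/1 loop rather
than replacing it. The names probe and poll, and the default timeout and
interval, are illustrative assumptions and not anything from
erl_interface.

```python
import socket
import time
from datetime import datetime


def probe(host, port, timeout=2.0):
    """Attempt one TCP connection and return (ok, detail)."""
    start = time.monotonic()
    try:
        # create_connection resolves the host and connects; the socket is
        # closed again as soon as the with-block exits.
        with socket.create_connection((host, port), timeout=timeout):
            pass
    except OSError as exc:
        # Covers refused connections, timeouts and name-resolution errors.
        return False, f"{type(exc).__name__}: {exc}"
    elapsed_ms = (time.monotonic() - start) * 1000.0
    return True, f"connected in {elapsed_ms:.1f} ms"


def poll(host, port, attempts, interval=5.0,
         probe_fn=probe, sleep_fn=time.sleep):
    """Probe repeatedly, print one line per attempt, return failure count.

    probe_fn and sleep_fn are injectable (hypothetical hooks for this
    sketch) so the loop can be exercised without a network or real delays.
    """
    failures = 0
    for i in range(attempts):
        ok, detail = probe_fn(host, port)
        if not ok:
            failures += 1
        stamp = datetime.now().isoformat(timespec="seconds")
        status = "ok" if ok else "FAIL"
        print(f"{stamp} {host}:{port} {status} {detail}")
        if i < attempts - 1:
            sleep_fn(interval)
    return failures
```

For example, poll("server.example", 4369, attempts=10) would try the
placeholder host server.example ten times on port 4369, which is epmd's
default port, and return how many attempts failed.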

I know this sounds like overkill and really basic stuff, plus I don't
know how much access you have to your own servers and network, but if
the lower layers aren't working the higher layers can't.

As I said, see what other people on the list have to say. The above is
more a set of random thoughts on general things to try.

Jeff.


