[erlang-questions] Timeout in erl_call
Wed Oct 5 17:27:21 CEST 2011
We are seeing intermittent timeouts when using erl_call. The error
message we receive is "erl_call: unable to start node, error = -5",
which I traced down to meaning ERL_TIMEOUT when waiting for a response
from the node. It only happens on one of two machines, which are
identical in hardware and software.
Every 5 minutes we use erl_call to retrieve system stats and to fetch
some statistical data from a process. There is no computation, only
fetching and organising the information. Whenever I run this code
manually it runs so fast that there is no noticeable delay. However,
about 5-10 times per day erl_call crashes with a timeout. We run
erl_call and the node on the same machine, but we are using fully
qualified names.
We cannot find any general problem with the machine, the Erlang node
does not log anything, and we are unable to reproduce the problem.
I do not believe the code being executed inside the node by erl_call
takes so long to return that it is causing the timeout. My only
"measurement" of this however is running it manually and seeing that
it executes without noticeable delay.
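One way to get a real measurement instead of a manual impression would
be to wrap the periodic invocation in a small timing script. The
following is only a minimal sketch in Python, not something we run
today: the names timed_run and format_log_line, the threshold, and the
log format are all hypothetical, and the command (erl_call with
whatever arguments are already in use) is passed through unchanged.

```python
import subprocess
import time


def timed_run(cmd, slow_threshold_s=1.0):
    """Run cmd (a list of arguments) and measure its wall-clock duration.

    Returns (exit_code, elapsed_seconds, is_slow, stderr_text).
    slow_threshold_s is an arbitrary, hypothetical cut-off.
    """
    start = time.monotonic()
    proc = subprocess.run(cmd, capture_output=True, text=True)
    elapsed = time.monotonic() - start
    return proc.returncode, elapsed, elapsed >= slow_threshold_s, proc.stderr


def format_log_line(cmd, exit_code, elapsed, is_slow, stderr_text):
    """Build one log line so failures and slow runs can be correlated later."""
    status = "SLOW" if is_slow else "ok"
    return "%s exit=%d elapsed=%.3fs cmd=%r stderr=%r" % (
        status, exit_code, elapsed, cmd, stderr_text.strip())
```

Called from the 5-minute job with the existing command line, this would
record how long each run takes and what erl_call printed on stderr, so
that the 5-10 failing runs per day can be compared against the normal
ones and against other activity on the machine at the same time.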
Does anyone have any idea of what the problem might be? Or an idea on
how to narrow it down?
We could use different means of communicating with the node (http, tcp
socket, etc), but I would like to understand what is going wrong here
before changing anything.