[erlang-questions] Timeout in erl_call

Knut Nesheim knut.nesheim@REDACTED
Wed Oct 5 17:27:21 CEST 2011


Hello list,

We are seeing intermittent timeouts when using erl_call. The error
message we receive is "erl_call: unable to start node, error = -5",
which I traced to ERL_TIMEOUT, raised while waiting for a response
from the node. It only happens on one of two machines, which are
identical in hardware and software.

Every 5 minutes we use erl_call to retrieve system stats and to fetch
some statistical data from a process. There is no computation, only
fetching and organising the information. Whenever I run this code
manually, it runs so fast that there is no noticeable delay. However,
about 5-10 times per day erl_call will crash with a timeout. We run
erl_call and the node on the same machine, but we are using fully
qualified names.

We cannot find any general problem with the machine, the Erlang node
does not log anything, and we are unable to reproduce the problem.
I do not believe the code being executed inside the node by erl_call
takes so long to return that it is causing the timeout. However, my
only "measurement" of this is running it manually and seeing that
it executes without noticeable delay.
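
One way to replace that manual check with real numbers would be to wrap
the periodic invocation and record wall-clock time, exit status and
stderr for every run, so the slow or failing runs can later be lined up
against other activity on the machine. A minimal sketch in Python,
under these assumptions: the helper names timed_run and log_line and
the log format are made up for illustration, and the command passed in
is whatever erl_call line the periodic job already uses.

```python
import subprocess
import sys
import time


def timed_run(argv, timeout=None):
    """Run argv once and return (exit_code, elapsed_seconds, stderr_text).

    exit_code is None if the wrapper's own timeout expired first.
    """
    start = time.monotonic()
    try:
        proc = subprocess.run(argv, capture_output=True, text=True,
                              timeout=timeout)
        code, err = proc.returncode, proc.stderr
    except subprocess.TimeoutExpired:
        code, err = None, "wrapper timeout expired"
    elapsed = time.monotonic() - start
    return code, elapsed, err.strip()


def log_line(argv, code, elapsed, err):
    """Format one run as a single timestamped log line (hypothetical format)."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    return f"{stamp} exit={code} elapsed={elapsed:.3f}s stderr={err!r} cmd={argv!r}"


if __name__ == "__main__":
    # Everything after the script name is treated as the command to run,
    # e.g. the existing erl_call command line, unchanged.
    cmd = sys.argv[1:]
    if cmd:
        code, elapsed, err = timed_run(cmd)
        print(log_line(cmd, code, elapsed, err))
```

Run from the same scheduler in place of the bare command, this would
give one line per run. A gradual rise in elapsed time before a failure
would point in a different direction than a sudden jump.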

Does anyone have any idea what the problem might be, or how to narrow
it down?

We could use a different means of communicating with the node (HTTP,
TCP socket, etc.), but I would like to understand what is going wrong
here before changing anything.

Knut
-- 
Engineering
http://www.wooga.com | phone +49 151 57202523 | fax +49-30-8964 9064

wooga GmbH | Saarbruecker Str. 38 | 10405 Berlin | Germany
Sitz der Gesellschaft: Berlin; HRB 117846 B
Registergericht Berlin-Charlottenburg
Geschaeftsfuehrung: Jens Begemann, Philipp Moeser


