Overlong wait in receive after
Robert Raschke
rtrlists@REDACTED
Fri Feb 19 18:33:36 CET 2010
Hi,
has anyone encountered spurious overlong waits using receive ... after? That
is, the code says "after 1000 ->" but it's way longer than 1 second before
it gets there.
My code starts a Java jinterface node, the Java code creates the node,
creates an mbox, and waits a minute for a message to appear and sends one
back.
The Erlang code starts the Java as a port program, and starts sending ok
messages to the registered mbox of the Java program spaced 1 second apart.
My logging indicates that the sending of the messages is actually spaced
somewhere between 5 and 12 seconds apart!
This is on a Windows 2003 Server running R12B-5.
In Erlang (started with -sname "erl@REDACTED") this is roughly what's
happening (apologies for typos):
run() ->
Node_Name = "foo@REDACTED",
Mbox = {box, list_to_atom(Node_Name)},
Port_Pid = spawn_link(?MODULE, run_port_program, [Node_Name,
atom_to_list(erlang:get_cookie())]),
{ok, Java_Pid} = shake_hands(Mbox, 0).
shake_hands(Mbox, N) when N < 50 ->
error_logger:info_report([{module, ?MODULE}, {handshake, N}, {mbox,
Mbox}]),
Mbox ! {self(), ok},
receive
{ok, Pid} ->
{ok, Pid}
after 1000 ->
error_logger:error_report([{module, ?MODULE}, {'handshake timeout',
N}]),
shake_hands(Mbox, N+1)
end;
shake_hands(_Mbox, N) when N >= 50 ->
timeout.
run_port_program(Node_Name, Cookie) ->
Cmd = lists:flatten(["java MyCode \"", Node_Name, "\" \"", Cookie,
"\""]),
P = open_port({spawn, Cmd}, [stream, {line, 100}, exit_status]),
loop(P, Node_Name, [], []).
loop(Port, Name, Response, Line) ->
%% handle the messages coming from Java's stdout ...
And the Java bit is doing:
String node_name = args[0];
String cookie = args[1];
OtpNode node = new OtpNode(node_name, cookie);
OtpMbox mbox = node.createMbox("box");
OtpErlangTuple msg = (OtpErlangTuple) mbox.receive(60000);
And my log contains entries where the time between the 'handshake' and the
'handhsake timeout' is somewhere between 5 and 12 seconds! Most of the times
the Java mbox.receive(60000) ends up timing out, after which, suddenly, the
log messages end up being 1 second spaced!
All extremely puzzling. To me it looks like something on the machine is
interfering with low-level comms. But that is currently just a hunch.
Any pointers or ideas on where to go hunting are greatly appreciated. I'll
probably try to create a standalone setup next, to see if I can find some
minimal code to exhibit this behaviour.
Thanks,
Robby
More information about the erlang-questions
mailing list