[erlang-questions] OTP 22.1 socket.erl somehow breaks message delivery or scheduler

Andreas Schultz andreas.schultz@REDACTED
Wed Oct 23 15:53:39 CEST 2019


After converting an application to socket.erl in OTP 22.1, the test suites
started to fail with random timeouts. It took me a while to figure out that
gen_server:call requests were arriving at the server multiple seconds late.

I have a demonstration at
https://gist.github.com/RoadRunnr/311a7679fff6fbdf367c455b960f1ba8. It
implements a simple UDP echo server with socket.erl. The client uses
gen_udp to send messages and waits for the response.
The client also sends an Erlang ping message to the server and expects to
get a pong answer back. The socket.erl-based server is supposed to be
non-blocking (and as far as I can tell, it is), so it should be able to
answer the Erlang ping message at all times.
There are also some simple busy-loop processes running to generate load.
Without them the problem is not reproducible.
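
For readers who do not want to open the gist, the ping/pong part of the
test can be sketched roughly as below. This is only a minimal illustration
and not the code from the gist: the module name ping_probe and all of its
functions are hypothetical, no socket is involved, and the server loop does
nothing except answer ping messages. It shows the shape of the probe: a
ping is sent with a unique reference, and the client either measures the
time until the matching pong or gives up after a timeout.

```erlang
-module(ping_probe).
-export([run/0, server/0, busy/0, ping/2]).

%% Hypothetical stand-in for the server process: it only answers pings.
%% The real server in the gist also handles the UDP socket.
server() ->
    receive
        {ping, From, Ref} ->
            From ! {pong, Ref},
            server();
        stop ->
            ok
    end.

%% Busy loop that burns reductions to put load on the schedulers.
busy() ->
    busy().

%% Send one ping and return {ok, Nanoseconds} for the round trip,
%% or 'timeout' if no matching pong arrives within Timeout milliseconds.
ping(Server, Timeout) ->
    Ref = make_ref(),
    T0 = erlang:monotonic_time(nanosecond),
    Server ! {ping, self(), Ref},
    receive
        {pong, Ref} ->
            {ok, erlang:monotonic_time(nanosecond) - T0}
    after Timeout ->
        timeout
    end.

%% Start the server and one busy loop per scheduler, ping once, clean up.
run() ->
    Server = spawn(fun server/0),
    N = erlang:system_info(schedulers),
    Loaders = [spawn(fun busy/0) || _ <- lists:seq(1, N)],
    Result = ping(Server, 1000),
    [exit(P, kill) || P <- Loaders],
    Server ! stop,
    Result.
```

Calling ping_probe:run() returns either {ok, Nanoseconds} or timeout. With
plain message passing like this I would expect the pong to come back well
within the timeout even under load, which is why the late ping in the
socket.erl version is surprising.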

The sample fails in about 20% of the test runs. When it does, the
output looks something like:

$ ~/stest.escript
Server Pid <0.78.0>
Server Addr #{addr => {127,0,0,1},family => inet,port => 38959}
ping timeout
           round trip    Clnt/Srvr    Srvr/Clnt     ProcPing
      85: ******** ns, ******** ns,    57675 ns,    42332 ns

The failure happens because a 'ping' message is not seen in time by the
receive clause in the server process. It seems that either the process is
not scheduled for some time (multiple seconds), or the scan of the
mailbox is missing the message.

I have ruled out that the UDP messages are being dropped; otherwise the
client's gen_udp:recv would never return.

Does anyone have a clue what might cause this? Or can anyone point out
where my sample is broken?

Many thanks


Andreas Schultz