The missing UDP packets
Raimo Niskanen
raimo+erlang-questions@REDACTED
Tue Feb 11 16:21:07 CET 2020
On Mon, Feb 10, 2020 at 04:23:37PM +0100, Raimo Niskanen wrote:
> I can reproduce it on a month old 'master'.
> We'll take a look at it...
>
> Thank you for a pinpointed bug report!
> / Raimo
I have a patch that fixes the problem:
@@ -12609,12 +12609,8 @@ static ErlDrvSSizeT packet_inet_ctl(ErlDrvData e,
unsigned int cmd, char* buf,
return ctl_error(EALREADY, rbuf, rsize);
if (packet_inet_input(udesc, desc->event) == 0) {
- if (timeout == 0)
- async_error_am(desc, am_timeout);
- else {
- if (timeout != INET_INFINITY)
- driver_set_timer(desc->port, timeout);
- }
+ if (timeout != INET_INFINITY)
+ driver_set_timer(desc->port, timeout);
}
return ctl_reply(INET_REP_OK, tbuf, 2, rbuf, rsize);
}
It should be possible to apply to any version. The code lines I removed
have been around since we entered GitHub.
The shortcut for timeout == 0 was flawed. Right before this an async event
was enqueued, which is not cleared when taking this short way out, and the
packet_inet_input function had put the socket in the poll set, which the
shortcut also ignored. So when the UDP packet arrives it is received and
delivery to the caller fails since there is no longer any caller - the caller
was already told {error, timeout}. The packet gets wasted.
This fix removes the shortcut for timeout == 0. It would be possible to
instead fix the shortcut, which could improve latency slightly. With this
fix code sets a timer for timeout == 0, which immediately times out
and then all internal state gets updated as it should. The same code path
is used for all timeouts including 0.
Again, thank you for the bug report and pinpointing.
I will now create an internal ticket, write a test case, and prepare
for a patch... We would prefer to just include this fix in the upcoming
OTP 22.3.
Best regards
/ Raimo Niskanen
>
>
> On Fri, Feb 07, 2020 at 04:50:54PM +0300, Alexander Petrovsky wrote:
> > >
> > > From: Valentin Micic
> > >
> > > I’ve run a couple of tests on my own, and managed to reproduce your
> > > “second! and only second!!” packet problem.
> > >
> > > It happens under following circumstances:
> > >
> > > 1) Socket (Rx) is instructed to provide a single packet as it arrives ( as
> > > per {active, once} );
> > >
> > >
> > > 2) After receiving this packet, a call to gen_udp:recv( Rx, 0, 0 ) is
> > > executed before another packet has ARRIVED to the kernel buffer;
> > >
> > > Then the following diagram illustrates the problem:
> > >
> > > Note:
> > > - Tx is sending socket;
> > > - Rx is receiving socket;
> > >
> > > *Please note that sequence diagrams below are not completely accurate, but
> > > it is reasonably illustrative of what may have happened.*
> > >
> > >
> > >
> > > The next sequence diagram depicts a situation when a non-zero value for
> > > gen_udp:recv/3, e.g. gen_udp:recv( Rx, 0, *10* ):
> > >
> > >
> > >
> > >
> > > Thus, it appears that gen_udp:recv( *Socket, Length*, *Timeout* ) is
> > > broken when *Timeout* is set to *0*.
> > > For all positive non-zero values for Timeout, it behaves properly (or as
> > > intuitively expected).
> > >
> > > Therefore, this should either be fixed, or it should be documented in
> > > order to indicate that 0 must *not* be used as a value for the Timeout in
> > > gen_udp:recv/3 call.
> > >
> > > Aside, I fail to see a benefit of using Timeout of value 0 when combined
> > > with {active, once}. The only time when zero value may appear to be useful
> > > is when server does need to process other messages, e.g. messages not
> > > originated by the socket, which would be the case when gen_server is used
> > > to support the socket. In my view (which is supported by substantial
> > > empirical evidence), gen_server is not the most optimal behaviour to use
> > > when implementing support for sockets.
> > >
> > > Kind regards
> > >
> > > V/
> > >
> > >
> > So, seems like there is some buggy behavior in the erlang VM for reading
> > UDP messages between switching from {active, once} to gen_udp:recv(socket,
> > 0, 0)? Lukas, Raimo could you help with that?
> >
> > --
> > Петровский Александр / Alexander Petrovsky,
> >
> > Skype: askjuise
> > Phone: +7 931 9877991
>
>
>
>
> --
>
> / Raimo Niskanen, Erlang/OTP, Ericsson AB
--
/ Raimo Niskanen, Erlang/OTP, Ericsson AB
More information about the erlang-questions
mailing list