The missing UDP packets
Raimo Niskanen
raimo+erlang-questions@REDACTED
Wed Feb 12 15:13:29 CET 2020
On Wed, Feb 12, 2020 at 02:31:10PM +0300, Alexander Petrovsky wrote:
> Awesome, thanks a lot :)
>
> I'll test this fix and in case of some problems, I will let you know!
Great!
I would apprecite if you could verify also if the fix works,
so I can merge it into ourt maint and master.
/ Raimo
>
> вт, 11 февр. 2020 г. в 21:01, Valentin Micic <v@REDACTED>:
>
> > Thank you Raimo, much obliged.
> > I am sure this will make Alexander very happy :-)
> >
> > V/
> >
> > > On 11 Feb 2020, at 17:21, Raimo Niskanen <
> > raimo+erlang-questions@REDACTED> wrote:
> > >
> > > On Mon, Feb 10, 2020 at 04:23:37PM +0100, Raimo Niskanen wrote:
> > >> I can reproduce it on a month old 'master'.
> > >> We'll take a look at it...
> > >>
> > >> Thank you for a pinpointed bug report!
> > >> / Raimo
> > >
> > > I have a patch that fixes the problem:
> > >
> > > @@ -12609,12 +12609,8 @@ static ErlDrvSSizeT packet_inet_ctl(ErlDrvData
> > e,
> > > unsigned int cmd, char* buf,
> > > return ctl_error(EALREADY, rbuf, rsize);
> > >
> > > if (packet_inet_input(udesc, desc->event) == 0) {
> > > - if (timeout == 0)
> > > - async_error_am(desc, am_timeout);
> > > - else {
> > > - if (timeout != INET_INFINITY)
> > > - driver_set_timer(desc->port, timeout);
> > > - }
> > > + if (timeout != INET_INFINITY)
> > > + driver_set_timer(desc->port, timeout);
> > > }
> > > return ctl_reply(INET_REP_OK, tbuf, 2, rbuf, rsize);
> > > }
> > >
> > > It should be possible to apply to any version. The code lines I removed
> > > have been around since we entered GitHub.
> > >
> > > The shortcut for timeout == 0 was flawed. Right before this an async
> > event
> > > was enqueued, which is not cleared when taking this short way out, and
> > the
> > > packet_inet_input function had put the socket in the poll set, which the
> > > shortcut also ignored. So when the UDP packet arrives it is received and
> > > delivery to the caller fails since there is no longer any caller - the
> > caller
> > > was already told {error, timeout}. The packet gets wasted.
> > >
> > > This fix removes the shortcut for timeout == 0. It would be possible to
> > > instead fix the shortcut, which could improve latency slightly. With
> > this
> > > fix code sets a timer for timeout == 0, which immediately times out
> > > and then all internal state gets updated as it should. The same code
> > path
> > > is used for all timeouts including 0.
> > >
> > > Again, thank you for the bug report and pinpointing.
> > >
> > > I will now create an internal ticket, write a test case, and prepare
> > > for a patch... We would prefer to just include this fix in the upcoming
> > > OTP 22.3.
> > >
> > > Best regards
> > > / Raimo Niskanen
> > >
> > >
> > >>
> > >>
> > >> On Fri, Feb 07, 2020 at 04:50:54PM +0300, Alexander Petrovsky wrote:
> > >>>>
> > >>>> From: Valentin Micic
> > >>>>
> > >>>> I’ve run a couple of tests on my own, and managed to reproduce your
> > >>>> “second! and only second!!” packet problem.
> > >>>>
> > >>>> It happens under following circumstances:
> > >>>>
> > >>>> 1) Socket (Rx) is instructed to provide a single packet as it arrives
> > ( as
> > >>>> per {active, once} );
> > >>>>
> > >>>>
> > >>>> 2) After receiving this packet, a call to gen_udp:recv( Rx, 0, 0 ) is
> > >>>> executed before another packet has ARRIVED to the kernel buffer;
> > >>>>
> > >>>> Then the following diagram illustrates the problem:
> > >>>>
> > >>>> Note:
> > >>>> - Tx is sending socket;
> > >>>> - Rx is receiving socket;
> > >>>>
> > >>>> *Please note that sequence diagrams below are not completely
> > accurate, but
> > >>>> it is reasonably illustrative of what may have happened.*
> > >>>>
> > >>>>
> > >>>>
> > >>>> The next sequence diagram depicts a situation when a non-zero value
> > for
> > >>>> gen_udp:recv/3, e.g. gen_udp:recv( Rx, 0, *10* ):
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>> Thus, it appears that gen_udp:recv( *Socket, Length*, *Timeout* ) is
> > >>>> broken when *Timeout* is set to *0*.
> > >>>> For all positive non-zero values for Timeout, it behaves properly (or
> > as
> > >>>> intuitively expected).
> > >>>>
> > >>>> Therefore, this should either be fixed, or it should be documented in
> > >>>> order to indicate that 0 must *not* be used as a value for the
> > Timeout in
> > >>>> gen_udp:recv/3 call.
> > >>>>
> > >>>> Aside, I fail to see a benefit of using Timeout of value 0 when
> > combined
> > >>>> with {active, once}. The only time when zero value may appear to be
> > useful
> > >>>> is when server does need to process other messages, e.g. messages not
> > >>>> originated by the socket, which would be the case when gen_server is
> > used
> > >>>> to support the socket. In my view (which is supported by substantial
> > >>>> empirical evidence), gen_server is not the most optimal behaviour to
> > use
> > >>>> when implementing support for sockets.
> > >>>>
> > >>>> Kind regards
> > >>>>
> > >>>> V/
> > >>>>
> > >>>>
> > >>> So, seems like there is some buggy behavior in the erlang VM for
> > reading
> > >>> UDP messages between switching from {active, once} to
> > gen_udp:recv(socket,
> > >>> 0, 0)? Lukas, Raimo could you help with that?
> > >>>
> > >>> --
> > >>> Петровский Александр / Alexander Petrovsky,
> > >>>
> > >>> Skype: askjuise
> > >>> Phone: +7 931 9877991
> > >>
> > >>
> > >>
> > >>
> > >> --
> > >>
> > >> / Raimo Niskanen, Erlang/OTP, Ericsson AB
> > >
> > > --
> > >
> > > / Raimo Niskanen, Erlang/OTP, Ericsson AB
> >
> >
>
> --
> Alexander Petrovsky
--
/ Raimo Niskanen, Erlang/OTP, Ericsson AB
More information about the erlang-questions
mailing list