The missing UDP packets
Raimo Niskanen
raimo+erlang-questions@REDACTED
Fri Feb 14 14:33:45 CET 2020
On Fri, Feb 14, 2020 at 04:23:15PM +0300, Alexander Petrovsky wrote:
> Hello!
>
> We have tested this fix in the idle state when there is not so much traffic
> on the UDP socket and also under the load. In both cases, the problem has
> disappeared, so I guess we could say that the fix works correctly, thanks!
Good to know. Thank you for informing!
/ Raimo
>
> ср, 12 февр. 2020 г. в 17:13, Raimo Niskanen <
> raimo+erlang-questions@REDACTED>:
>
> > On Wed, Feb 12, 2020 at 02:31:10PM +0300, Alexander Petrovsky wrote:
> > > Awesome, thanks a lot :)
> > >
> > > I'll test this fix and in case of some problems, I will let you know!
> >
> > Great!
> >
> > I would apprecite if you could verify also if the fix works,
> > so I can merge it into ourt maint and master.
> >
> > / Raimo
> >
> >
> > >
> > > вт, 11 февр. 2020 г. в 21:01, Valentin Micic <v@REDACTED>:
> > >
> > > > Thank you Raimo, much obliged.
> > > > I am sure this will make Alexander very happy :-)
> > > >
> > > > V/
> > > >
> > > > > On 11 Feb 2020, at 17:21, Raimo Niskanen <
> > > > raimo+erlang-questions@REDACTED> wrote:
> > > > >
> > > > > On Mon, Feb 10, 2020 at 04:23:37PM +0100, Raimo Niskanen wrote:
> > > > >> I can reproduce it on a month old 'master'.
> > > > >> We'll take a look at it...
> > > > >>
> > > > >> Thank you for a pinpointed bug report!
> > > > >> / Raimo
> > > > >
> > > > > I have a patch that fixes the problem:
> > > > >
> > > > > @@ -12609,12 +12609,8 @@ static ErlDrvSSizeT
> > packet_inet_ctl(ErlDrvData
> > > > e,
> > > > > unsigned int cmd, char* buf,
> > > > > return ctl_error(EALREADY, rbuf, rsize);
> > > > >
> > > > > if (packet_inet_input(udesc, desc->event) == 0) {
> > > > > - if (timeout == 0)
> > > > > - async_error_am(desc, am_timeout);
> > > > > - else {
> > > > > - if (timeout != INET_INFINITY)
> > > > > - driver_set_timer(desc->port, timeout);
> > > > > - }
> > > > > + if (timeout != INET_INFINITY)
> > > > > + driver_set_timer(desc->port, timeout);
> > > > > }
> > > > > return ctl_reply(INET_REP_OK, tbuf, 2, rbuf, rsize);
> > > > > }
> > > > >
> > > > > It should be possible to apply to any version. The code lines I
> > removed
> > > > > have been around since we entered GitHub.
> > > > >
> > > > > The shortcut for timeout == 0 was flawed. Right before this an async
> > > > event
> > > > > was enqueued, which is not cleared when taking this short way out,
> > and
> > > > the
> > > > > packet_inet_input function had put the socket in the poll set, which
> > the
> > > > > shortcut also ignored. So when the UDP packet arrives it is
> > received and
> > > > > delivery to the caller fails since there is no longer any caller -
> > the
> > > > caller
> > > > > was already told {error, timeout}. The packet gets wasted.
> > > > >
> > > > > This fix removes the shortcut for timeout == 0. It would be
> > possible to
> > > > > instead fix the shortcut, which could improve latency slightly. With
> > > > this
> > > > > fix code sets a timer for timeout == 0, which immediately times out
> > > > > and then all internal state gets updated as it should. The same code
> > > > path
> > > > > is used for all timeouts including 0.
> > > > >
> > > > > Again, thank you for the bug report and pinpointing.
> > > > >
> > > > > I will now create an internal ticket, write a test case, and prepare
> > > > > for a patch... We would prefer to just include this fix in the
> > upcoming
> > > > > OTP 22.3.
> > > > >
> > > > > Best regards
> > > > > / Raimo Niskanen
> > > > >
> > > > >
> > > > >>
> > > > >>
> > > > >> On Fri, Feb 07, 2020 at 04:50:54PM +0300, Alexander Petrovsky wrote:
> > > > >>>>
> > > > >>>> From: Valentin Micic
> > > > >>>>
> > > > >>>> I’ve run a couple of tests on my own, and managed to reproduce
> > your
> > > > >>>> “second! and only second!!” packet problem.
> > > > >>>>
> > > > >>>> It happens under following circumstances:
> > > > >>>>
> > > > >>>> 1) Socket (Rx) is instructed to provide a single packet as it
> > arrives
> > > > ( as
> > > > >>>> per {active, once} );
> > > > >>>>
> > > > >>>>
> > > > >>>> 2) After receiving this packet, a call to gen_udp:recv( Rx, 0, 0
> > ) is
> > > > >>>> executed before another packet has ARRIVED to the kernel buffer;
> > > > >>>>
> > > > >>>> Then the following diagram illustrates the problem:
> > > > >>>>
> > > > >>>> Note:
> > > > >>>> - Tx is sending socket;
> > > > >>>> - Rx is receiving socket;
> > > > >>>>
> > > > >>>> *Please note that sequence diagrams below are not completely
> > > > accurate, but
> > > > >>>> it is reasonably illustrative of what may have happened.*
> > > > >>>>
> > > > >>>>
> > > > >>>>
> > > > >>>> The next sequence diagram depicts a situation when a non-zero
> > value
> > > > for
> > > > >>>> gen_udp:recv/3, e.g. gen_udp:recv( Rx, 0, *10* ):
> > > > >>>>
> > > > >>>>
> > > > >>>>
> > > > >>>>
> > > > >>>> Thus, it appears that gen_udp:recv( *Socket, Length*, *Timeout* )
> > is
> > > > >>>> broken when *Timeout* is set to *0*.
> > > > >>>> For all positive non-zero values for Timeout, it behaves properly
> > (or
> > > > as
> > > > >>>> intuitively expected).
> > > > >>>>
> > > > >>>> Therefore, this should either be fixed, or it should be
> > documented in
> > > > >>>> order to indicate that 0 must *not* be used as a value for the
> > > > Timeout in
> > > > >>>> gen_udp:recv/3 call.
> > > > >>>>
> > > > >>>> Aside, I fail to see a benefit of using Timeout of value 0 when
> > > > combined
> > > > >>>> with {active, once}. The only time when zero value may appear to
> > be
> > > > useful
> > > > >>>> is when server does need to process other messages, e.g. messages
> > not
> > > > >>>> originated by the socket, which would be the case when gen_server
> > is
> > > > used
> > > > >>>> to support the socket. In my view (which is supported by
> > substantial
> > > > >>>> empirical evidence), gen_server is not the most optimal behaviour
> > to
> > > > use
> > > > >>>> when implementing support for sockets.
> > > > >>>>
> > > > >>>> Kind regards
> > > > >>>>
> > > > >>>> V/
> > > > >>>>
> > > > >>>>
> > > > >>> So, seems like there is some buggy behavior in the erlang VM for
> > > > reading
> > > > >>> UDP messages between switching from {active, once} to
> > > > gen_udp:recv(socket,
> > > > >>> 0, 0)? Lukas, Raimo could you help with that?
> > > > >>>
> > > > >>> --
> > > > >>> Петровский Александр / Alexander Petrovsky,
> > > > >>>
> > > > >>> Skype: askjuise
> > > > >>> Phone: +7 931 9877991
> > > > >>
> > > > >>
> > > > >>
> > > > >>
> > > > >> --
> > > > >>
> > > > >> / Raimo Niskanen, Erlang/OTP, Ericsson AB
> > > > >
> > > > > --
> > > > >
> > > > > / Raimo Niskanen, Erlang/OTP, Ericsson AB
> > > >
> > > >
> > >
> > > --
> > > Alexander Petrovsky
> >
> > --
> >
> > / Raimo Niskanen, Erlang/OTP, Ericsson AB
> >
>
>
> --
> Alexander Petrovsky
--
/ Raimo Niskanen, Erlang/OTP, Ericsson AB
More information about the erlang-questions
mailing list