The missing UDP packets

Alexander Petrovsky askjuise@REDACTED
Fri Feb 14 14:23:15 CET 2020


Hello!

We have tested this fix in the idle state when there is not so much traffic
on the UDP socket and also under the load. In both cases, the problem has
disappeared, so I guess we could say that the fix works correctly, thanks!

ср, 12 февр. 2020 г. в 17:13, Raimo Niskanen <
raimo+erlang-questions@REDACTED>:

> On Wed, Feb 12, 2020 at 02:31:10PM +0300, Alexander Petrovsky wrote:
> > Awesome, thanks a lot :)
> >
> > I'll test this fix and in case of some problems, I will let you know!
>
> Great!
>
> I would apprecite if you could verify also if the fix works,
> so I can merge it into ourt maint and master.
>
> / Raimo
>
>
> >
> > вт, 11 февр. 2020 г. в 21:01, Valentin Micic <v@REDACTED>:
> >
> > > Thank you Raimo, much obliged.
> > > I am sure this will make Alexander very happy :-)
> > >
> > > V/
> > >
> > > > On 11 Feb 2020, at 17:21, Raimo Niskanen <
> > > raimo+erlang-questions@REDACTED> wrote:
> > > >
> > > > On Mon, Feb 10, 2020 at 04:23:37PM +0100, Raimo Niskanen wrote:
> > > >> I can reproduce it on a month old 'master'.
> > > >> We'll take a look at it...
> > > >>
> > > >> Thank you for a pinpointed bug report!
> > > >> / Raimo
> > > >
> > > > I have a patch that fixes the problem:
> > > >
> > > > @@ -12609,12 +12609,8 @@ static ErlDrvSSizeT
> packet_inet_ctl(ErlDrvData
> > > e,
> > > > unsigned int cmd, char* buf,
> > > >               return ctl_error(EALREADY, rbuf, rsize);
> > > >
> > > >           if (packet_inet_input(udesc, desc->event) == 0) {
> > > > -             if (timeout == 0)
> > > > -                 async_error_am(desc, am_timeout);
> > > > -             else {
> > > > -                 if (timeout != INET_INFINITY)
> > > > -                     driver_set_timer(desc->port, timeout);
> > > > -             }
> > > > +                if (timeout != INET_INFINITY)
> > > > +                    driver_set_timer(desc->port, timeout);
> > > >           }
> > > >           return ctl_reply(INET_REP_OK, tbuf, 2, rbuf, rsize);
> > > >       }
> > > >
> > > > It should be possible to apply to any version.  The code lines I
> removed
> > > > have been around since we entered GitHub.
> > > >
> > > > The shortcut for timeout == 0 was flawed.  Right before this an async
> > > event
> > > > was enqueued, which is not cleared when taking this short way out,
> and
> > > the
> > > > packet_inet_input function had put the socket in the poll set, which
> the
> > > > shortcut also ignored.  So when the UDP packet arrives it is
> received and
> > > > delivery to the caller fails since there is no longer any caller -
> the
> > > caller
> > > > was already told {error, timeout}.  The packet gets wasted.
> > > >
> > > > This fix removes the shortcut for timeout == 0.  It would be
> possible to
> > > > instead fix the shortcut, which could improve latency slightly.  With
> > > this
> > > > fix code sets a timer for timeout == 0, which immediately times out
> > > > and then all internal state gets updated as it should.  The same code
> > > path
> > > > is used for all timeouts including 0.
> > > >
> > > > Again, thank you for the bug report and pinpointing.
> > > >
> > > > I will now create an internal ticket, write a test case, and prepare
> > > > for a patch...  We would prefer to just include this fix in the
> upcoming
> > > > OTP 22.3.
> > > >
> > > > Best regards
> > > > / Raimo Niskanen
> > > >
> > > >
> > > >>
> > > >>
> > > >> On Fri, Feb 07, 2020 at 04:50:54PM +0300, Alexander Petrovsky wrote:
> > > >>>>
> > > >>>> From: Valentin Micic
> > > >>>>
> > > >>>> I’ve run a couple of tests on my own, and managed to reproduce
> your
> > > >>>> “second! and only second!!” packet problem.
> > > >>>>
> > > >>>> It happens under following circumstances:
> > > >>>>
> > > >>>> 1) Socket (Rx) is instructed to provide a single packet as it
> arrives
> > > ( as
> > > >>>> per {active, once} );
> > > >>>>
> > > >>>>
> > > >>>> 2) After receiving this packet, a call to gen_udp:recv( Rx, 0, 0
> ) is
> > > >>>> executed before another packet has ARRIVED to the kernel buffer;
> > > >>>>
> > > >>>> Then the following diagram illustrates the problem:
> > > >>>>
> > > >>>> Note:
> > > >>>>    - Tx is sending socket;
> > > >>>>    - Rx is receiving socket;
> > > >>>>
> > > >>>> *Please note that sequence diagrams below are not completely
> > > accurate, but
> > > >>>> it is reasonably illustrative of what may have happened.*
> > > >>>>
> > > >>>>
> > > >>>>
> > > >>>> The next sequence diagram depicts a situation when a non-zero
> value
> > > for
> > > >>>> gen_udp:recv/3, e.g. gen_udp:recv( Rx, 0, *10* ):
> > > >>>>
> > > >>>>
> > > >>>>
> > > >>>>
> > > >>>> Thus, it appears that gen_udp:recv( *Socket, Length*, *Timeout* )
> is
> > > >>>> broken when *Timeout* is set to *0*.
> > > >>>> For all positive non-zero values for Timeout, it behaves properly
> (or
> > > as
> > > >>>> intuitively expected).
> > > >>>>
> > > >>>> Therefore, this should either be fixed, or it should be
> documented in
> > > >>>> order to indicate that 0 must *not* be used as a value for the
> > > Timeout in
> > > >>>> gen_udp:recv/3 call.
> > > >>>>
> > > >>>> Aside, I fail to see a benefit of using Timeout of value 0 when
> > > combined
> > > >>>> with {active, once}. The only time when zero value may appear to
> be
> > > useful
> > > >>>> is when server does need to process other messages, e.g. messages
> not
> > > >>>> originated by the socket, which would be the case when gen_server
> is
> > > used
> > > >>>> to support the socket. In my view (which is supported by
> substantial
> > > >>>> empirical evidence), gen_server is not the most optimal behaviour
> to
> > > use
> > > >>>> when implementing support for sockets.
> > > >>>>
> > > >>>> Kind regards
> > > >>>>
> > > >>>> V/
> > > >>>>
> > > >>>>
> > > >>> So, seems like there is some buggy behavior in the erlang VM for
> > > reading
> > > >>> UDP messages between switching from {active, once} to
> > > gen_udp:recv(socket,
> > > >>> 0, 0)? Lukas, Raimo could you help with that?
> > > >>>
> > > >>> --
> > > >>> Петровский Александр / Alexander Petrovsky,
> > > >>>
> > > >>> Skype: askjuise
> > > >>> Phone: +7 931 9877991
> > > >>
> > > >>
> > > >>
> > > >>
> > > >> --
> > > >>
> > > >> / Raimo Niskanen, Erlang/OTP, Ericsson AB
> > > >
> > > > --
> > > >
> > > > / Raimo Niskanen, Erlang/OTP, Ericsson AB
> > >
> > >
> >
> > --
> > Alexander Petrovsky
>
> --
>
> / Raimo Niskanen, Erlang/OTP, Ericsson AB
>


-- 
Alexander Petrovsky
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20200214/1b124f79/attachment.htm>


More information about the erlang-questions mailing list