[erlang-questions] How to handle a massive amount of UDP packets?
Valentin Micic
v@REDACTED
Mon Apr 23 10:59:12 CEST 2012
First, why are we making references to TCP when the subject line says we're discussing UDP? ;-)
Second, may I say why I think Ulf's approach is better than Valentin's (hey, finally a namesake -- pleased to meet you):
Valentin's approach:
Whilst {active, 100} may reduce some overhead, it does far more damage if one considers ease of programming. For example, how would one know when to issue another {active, N}? What's worse than "useless process switching" is having the programmer do the counting in order to issue another {active, 100} at the end of the cycle. And what happens if one issues {active, 100} while another one is still running? Does this mean that one should expect an additional 100 messages, or just 100?
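To make the counting burden concrete, here is a minimal sketch of what the owner process would have to do under such hypothetical {active, N} semantics, where it must track the remaining count itself and re-arm the socket (handle_packet/1 is an illustrative placeholder, not part of any API):

    udp_loop(Socket, 0) ->
        %% all 100 messages of the current cycle consumed; re-arm the socket
        ok = inet:setopts(Socket, [{active, 100}]),
        udp_loop(Socket, 100);
    udp_loop(Socket, Remaining) ->
        receive
            {udp, Socket, _Ip, _Port, Packet} ->
                handle_packet(Packet),
                udp_loop(Socket, Remaining - 1)
        end.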
OTOH, "useless process switching" notwithstanding, Ulf's approach offers a far simpler solution that does not require changes to driver -- if nothing else, far more practical approach.
Now, if I may be impractical, there's an alternative approach, which we've used for some high-throughput application operating on a raw socket - since raw IP is not natively supported by Erlang, we had to write a driver for it. Motivated by a need to increase a throughput, we derived a *at-most-N method*. Simply put, the semantic of {active, 100} syntax in this case would mean that driver shall send in one message a list of up to 100 packets. Assuming it would be simpler for a programmer to traverse a list, rather than counting variable-timing events.
This method combines relative simplicity suggested by Ulf, with lower overhead advocated by Valentin Nechayev.
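For illustration, the owner side under such at-most-N semantics could look roughly like this; the {udp_packets, Socket, Packets} message format and handle_packet/1 are purely illustrative, since the driver in question was our own and not part of OTP:

    batch_loop(Socket) ->
        %% ask the driver for the next batch: at most 100 packets in one message
        ok = inet:setopts(Socket, [{active, 100}]),
        receive
            {udp_packets, Socket, Packets} ->
                lists:foreach(fun handle_packet/1, Packets),
                batch_loop(Socket)
        end.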
Saying that Ulf's approach gives useless process switches is itself not much of an argument -- process switches are what computers do.
Kind regards
V/
On 23 Apr 2012, at 8:25 AM, Valentin Nechayev wrote:
>> From: Ulf Wiger <ulf@REDACTED>
>
>> What one can do is to combine {active, once} with gen_tcp:recv().
>>
>> Essentially, you will be served the first message, then read as many as you
>> wish from the socket. When the socket is empty, you can again enable
>> {active, once}.
>
> First, the approach you describe is quite badly documented. There is no
> description of how such a non-blocking recv() can be achieved. If it means a
> call with Timeout=0, the timeout() type isn't defined, and neither is the
> return value on timeout; the documentation only lists Reason = closed or
> inet:posix(). And it would be incorrect to guess that eagain (or ewouldblock?)
> is returned merely because the implementation happens to treat every timeout
> value other than infinity uniformly. I dislike relying on such undocumented
> behaviour.
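For what it's worth, the pattern Ulf describes can be sketched as below, assuming a binary-mode gen_tcp socket and assuming that gen_tcp:recv/3 with Timeout=0 returns {error, timeout} once the socket is drained -- the very behaviour whose documentation is criticised above. handle_data/1 is a placeholder.

    owner_loop(Socket) ->
        ok = inet:setopts(Socket, [{active, once}]),
        receive
            {tcp, Socket, First} ->
                handle_data(First),
                drain(Socket),                %% socket is passive again here
                owner_loop(Socket);
            {tcp_closed, Socket} ->
                ok
        end.

    drain(Socket) ->
        case gen_tcp:recv(Socket, 0, 0) of    %% Length 0 = whatever is available
            {ok, Data}       -> handle_data(Data), drain(Socket);
            {error, timeout} -> ok;           %% nothing left right now
            {error, closed}  -> ok
        end.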
>
> Second, your approach causes useless process switches. If a long message is
> being received over TCP, there will be two or more switches to the owner -
> one for the first part of the message and further ones for the rest of it.
> If the incoming rate is low enough to process each small portion (one TCP
> window) separately, the owner process will receive and process them
> separately; if the owner and the system aren't fast enough for such
> switching, the data will group into larger portions. This means that any
> performance measurement will be a total lie, with three intervals -
> uselessly quick saturation, then a stable 100% over a wide load interval,
> and then unexpected overload. It is very hard to diagnose and optimise a
> system with such behaviour, and this tendency of one subsystem to fill the
> whole system affects other concurrent subsystems badly.
>
> People have invented many mechanisms for avoiding both uselessly fast
> switching and unreasonable delays - see e.g. VMIN and VTIME in termios, or
> the low watermark in BSD sockets. Max Lapshin's proposal is among them and
> only needs a small but important extension - to specify both a full limit
> and an inter-portion timeout.
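At the process level (rather than inside the driver, where it really belongs), that combination of a full limit plus an inter-portion timeout can be sketched with a plain receive ... after loop on an {active, true} UDP socket; the names below are illustrative only:

    %% Collect at most Max packets, but return early if the line stays quiet
    %% for Timeout milliseconds.
    collect(Socket, Max, Timeout) ->
        collect(Socket, Max, Timeout, []).

    collect(_Socket, 0, _Timeout, Acc) ->
        lists:reverse(Acc);                   %% full limit reached
    collect(Socket, N, Timeout, Acc) ->
        receive
            {udp, Socket, _Ip, _Port, Packet} ->
                collect(Socket, N - 1, Timeout, [Packet | Acc])
        after Timeout ->
            lists:reverse(Acc)                %% inter-portion timeout expired
        end.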
>
> Third, please see the measurements by John-Paul Bader in a neighbouring
> message: with {active,false} he gets substantial packet loss compared to
> {active,true}. Yes, this is UDP-specific and nobody guarantees delivery,
> but there is no reason to increase the loss needlessly. His result should
> be checked for the real cause, but my guess is socket buffer overflows.
> With {active,true}, the owner's mailbox becomes an additional, much larger
> socket buffer, but the owner process loses control over its mailbox. With
> a window of allowed packets, it could tune its load more finely.
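As a side note, the guess about socket buffer overflows can at least be checked and mitigated from Erlang with the documented recbuf and read_packets options; the port and values below are illustrative only:

    %% Shell-style sketch: ask for a larger kernel receive buffer, let the
    %% emulator pull more datagrams per poll, then see what the OS granted.
    {ok, Socket} = gen_udp:open(5555, [binary, {active, true},
                                       {recbuf, 1024 * 1024},
                                       {read_packets, 100}]),
    {ok, [{recbuf, Granted}]} = inet:getopts(Socket, [recbuf]),
    io:format("recbuf granted by the OS: ~p bytes~n", [Granted]).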
>
>
> -netch-
> _______________________________________________
> erlang-questions mailing list
> erlang-questions@REDACTED
> http://erlang.org/mailman/listinfo/erlang-questions