[erlang-questions] running without net tick

Jayson Vantuyl kagato@REDACTED
Fri Sep 25 20:52:41 CEST 2009


I completely agree with the oil-and-water statement.  That said, TCP
does support OOB; it's just a bad idea to use it.

The theory mentioned for why it's not working was that the other data
multiplexed over the stream was choking out the ticks.

A dedicated connection makes that a non-issue.  It's the classic head- 
of-line blocking problem.

Let's say that ticks were reduced to a 4-byte timestamp (to give some
reference point if the connection is broken and re-established), and
that you send them every 10 seconds.  TCP/IP adds about 40 bytes of
header overhead, and Ethernet usually has a 1500-byte MTU.  That makes
room for an hour's worth of ticks in a single TCP packet over average
Ethernet (and probably at least 20 minutes' worth over any usable MTU).
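
To make that arithmetic concrete (rounding a little, and assuming the
usual 20-byte IPv4 header plus a 20-byte TCP header with no options),
a quick check in the Erlang shell:

1> Payload = 1500 - 40.            %% Ethernet MTU minus IP+TCP headers
1460
2> TicksPerPacket = Payload div 4. %% 4-byte ticks per full segment
365
3> TicksPerPacket * 10 / 3600.     %% one tick per 10s -> hours per packet
1.0138888888888888
4> ((576 - 40) div 4) * 10 / 60.   %% minutes over a 576-byte minimum MTU
22.333333333333332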

If we are limited to IP, want any chance of making it through a
firewall, want timely retries, and want a tick to generate one packet
(or less in aggregate), we are left with either UDP or TCP.  In the
above case, TCP generates about as much packet traffic as UDP and is
reasonably close to timely.  The packets are larger and the retries are
wasted, but they also back off exponentially.  A dedicated connection
does not have the head-of-line problem caused by multiplexing (which
is, admittedly, unproven as the problem).

The point of this exercise is that, unless you're going over a very
small or very high-latency pipe, UDP doesn't really give us anything
other than more work.  Why more work, you ask?  Because UDP doesn't
retry.  Sending a tick every 10 seconds over TCP is not the same thing
as sending a tick every 10 seconds over UDP.  Why?  Assume 75% random
packet loss.  That means you're likely to get a single UDP tick through
only about every 40 seconds.  With TCP, the automatic retry will turn
that into approximately one tick every 10.X seconds, where X depends
entirely on latency (and is probably very small).  Does TCP do this
with more traffic?  Yes.  However, it does it with exponential backoff,
a window to limit the number of outstanding packets, PMTU discovery so
that packets don't get fragmented, an RST mechanism to break
connections if the remote host has rebooted, the option to use SSL to
encrypt the session, and so on.
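
To spell out the loss arithmetic (illustrative only; the module name is
made up and this is not part of any real code), the expected gap
between ticks that actually arrive over raw UDP is just the send
interval divided by the delivery probability:

-module(tick_loss).
-export([expected_udp_gap/2]).

%% Expected seconds between *delivered* ticks when one tick is sent every
%% IntervalS seconds and each datagram is dropped independently with
%% probability Loss (no retries, as with raw UDP).
expected_udp_gap(IntervalS, Loss) when Loss >= 0, Loss < 1 ->
    IntervalS / (1 - Loss).

%% tick_loss:expected_udp_gap(10, 0.75) -> 40.0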

There are almost no cases that actually demand UDP that a single TCP
connection doesn't serve very well.  I'd strongly recommend not
ignoring TCP's benefits: real network conditions almost never favor
UDP, and UDP does not favor a simple implementation.

On Sep 25, 2009, at 4:26 AM, Valentin Micic wrote:

> I beg to differ -- my take is that TCP reliability is part of the
> problem in this case.  Whilst buffering and flow control are important
> for, say, file transfer, they are completely irrelevant for TICK and
> health checks (so what if it doesn't get there -- I can send it again
> without any consequence!).
>
> The argument about UDP unreliability sounds more like a mantra than a
> proper argument (if only I got a penny every time I've heard it (-:).
> There are only two fundamental differences (*) between TCP and UDP...
> actually only one, because the second is conditioned by the first: TCP
> supports a stream, whilst UDP supports message-bound communication;
> as a consequence, TCP requires some form of flow control to support
> stream processing.
>
> In this particular case: what possible benefit can one derive from
> sending a message over a stream as opposed to sending just a message?
> If the message is short enough to fit in a datagram -- none!
>
> As for the ability to send urgent data (OOB) over a TCP socket -- data
> streams and OOB data mix like oil and water.  I have yet to see a
> successful utilization of OOB (issued by a user) that hasn't resulted
> in a connection reset (or system shutdown (-;).
>
> Lastly, if TICK is implemented via a separate TCP socket, that would
> double the networking resources required -- you'd need a new socket for
> every node you're connected to.  With UDP, all you need is one socket
> and a very basic protocol:
>
> 	1) Ask when you have to;
> 	2) Answer when asked.
>
> Mind you, net-kernel is already doing this.
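
For what it's worth, that ask/answer protocol really is small.  Here is
a minimal sketch over a single gen_udp socket -- the module name and the
one-word message format are invented for illustration, and this is not
what net_kernel actually does:

-module(udp_health).
-export([start/1, ask/3]).

%% One process owns one UDP socket, asks peers when told to, and answers
%% every "ask" it receives.
start(Port) ->
    spawn(fun() ->
                  {ok, Socket} = gen_udp:open(Port, [binary, {active, true}]),
                  loop(Socket)
          end).

%% 1) Ask when you have to.
ask(Pid, PeerHost, PeerPort) ->
    Pid ! {ask, PeerHost, PeerPort},
    ok.

loop(Socket) ->
    receive
        {ask, PeerHost, PeerPort} ->
            ok = gen_udp:send(Socket, PeerHost, PeerPort, <<"ask">>),
            loop(Socket);
        %% 2) Answer when asked.
        {udp, Socket, FromIP, FromPort, <<"ask">>} ->
            ok = gen_udp:send(Socket, FromIP, FromPort, <<"answer">>),
            loop(Socket);
        {udp, Socket, _FromIP, _FromPort, <<"answer">>} ->
            %% Peer answered; a real implementation would record the time here.
            loop(Socket);
        stop ->
            gen_udp:close(Socket)
    end.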
>
> V/
>
> (*) If one disregards things that UDP can do which TCP cannot, such
> as multi-drop, multicasting, etc.
>
> -----Original Message-----
> From: erlang-questions@REDACTED
> [mailto:erlang-questions@REDACTED] On Behalf Of Jayson Vantuyl
> Sent: 25 September 2009 12:25 PM
> To: Erlang-Questions Questions
> Subject: Re: [erlang-questions] running without net tick
>
> Short Version:
>
> Why not open a special "tick" TCP port?  UDP would require a reliable
> delivery implementation.  TCP saves quite a bit of work in that regard
> (and gets a lot of important but subtle things right).
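
A dedicated tick connection is not much code either.  A rough sketch --
the module name, the 10-second interval and the 4-byte counter payload
are just placeholders, and error handling is omitted:

-module(tcp_tick).
-export([connect/2]).

%% Open a connection used only for ticks and write a 4-byte counter on
%% it every 10 seconds; the peer would run a matching acceptor that
%% resets a timer whenever 4 bytes arrive.
connect(PeerHost, PeerPort) ->
    spawn(fun() ->
                  {ok, Socket} =
                      gen_tcp:connect(PeerHost, PeerPort,
                                      [binary, {packet, 0}, {nodelay, true}]),
                  tick_loop(Socket, 0)
          end).

tick_loop(Socket, N) ->
    ok = gen_tcp:send(Socket, <<N:32>>),   %% 4 bytes per tick, as above
    receive
    after 10000 ->
            tick_loop(Socket, N + 1)
    end.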
>
> Long Version:
>
> Also, never say never.
>
> Actually, you CAN send out-of-band data (also called urgent data)
> using TCP.  The original "WinNuke" (i.e. ping-of-death for Windows 95)
> was due to having a corrupt OOB header in a TCP packet.  In classic
> Microsoft / Internet style, the issue was further confused because it
> was widely reported as an "Out-of-Bounds" bug, so a generation of
> networking consultants have minor deviations in their interpretation
> of what the letters OOB mean.
>
> As for TCP Urgent Data / OOB, it seems to be specified well enough at
> the protocol level, but it doesn't appear to be handled uniformly
> across different socket implementations.
>
> Under Linux, you use send/recv with the MSG_OOB flag (or set the
> SO_OOBINLINE socket option to just inline the data).  The
> implementation appears to try to keep the urgent byte at a certain
> point in the data stream (i.e. to preserve some of the ordering), and
> certain conditions can cause it to become part of the "normal" stream
> of data.  It can also cause some odd signals to be delivered to the
> process.  Still, TCP *does* have OOB data support; it just may not be
> easily usable everywhere.
>
> On Sep 25, 2009, at 3:04 AM, Valentin Micic wrote:
>
>> You may change the TICK value all day long, but if the underlying
>> infrastructure is in some kind of trouble, that alone is not going to
>> solve the problem.
>>
>> The following is just speculation, but quite plausible in my mind:
>>
>> AFAIK, ERTS is multiplexing inter-nodal traffic over a single socket.
>> Thus, if the socket is heavily utilized, the sending buffer may get
>> congested due to a dynamically reduced TCP window size (because the
>> remote side is not flushing its buffer fast enough -- if the same
>> process is reading and writing the socket, this may cause a deadlock
>> under heavy load).  As much as I am not certain about the particular
>> implementation here, I know that the sender will not wait forever --
>> it will eventually time out, and this (exception?) has to be handled
>> somehow by the sender.  The reasonable course of action would be to
>> reset the connection.  If and when that happens, the node can be
>> declared unreachable; therefore a "net-split" may occur.  In other
>> words, a net-split may occur with or without the "ticker" process
>> running and regardless of the real network availability (*).
>>
>>
>> I think the net-tick method is good on its own; however, it is using
>> the *wrong* transport!  IMO, the tick should be handled as out-of-band
>> data, and this cannot be done using TCP/IP (well, at least not at the
>> user level).  My suggestion would be to use UDP for net-kernel
>> communication (including TICK messages).  This way one would be able
>> to find out about peer health more reliably (yes, a small protocol
>> may be required, but that's relatively easy).
>>
>> To make things simpler regarding the distribution, one may use the
>> same port number as advertised in EPMD for a particular node, and
>> hence bind the UDP socket to that number.
>>
>> V/
>>
>> (*) I've seen "net-splits" between nodes collocated on the same
>> machine -- therefore indicating a TCP buffer/load-related issue.
>> Maybe the situation could be improved by creating more than one
>> connection between two nodes, but that may come with a bag of
>> problems of its own.
>>
>>
>> -----Original Message-----
>> From: erlang-questions@REDACTED
>> [mailto:erlang-questions@REDACTED] On Behalf Of Ulf Wiger
>> Sent: 25 September 2009 09:13 AM
>> To: erlang-questions Questions
>> Subject: [erlang-questions] running without net tick
>>
>>
>> The problem of netsplits in Erlang comes up now and again.
>> I've mentioned that we used to have a more robust
>> supervision algorithm for device processor monitoring in
>> AXD 301...
>>
>> I read the following comment in kernel/src/dist_util.erl
>>
>> %% Send a TICK to the other side.
>> %%
>> %% This will happen every 15 seconds (by default)
>> %% The idea here is that every 15 secs, we write a little
>> %% something on the connection if we haven't written anything for
>> %% the last 15 secs.
>> %% This will ensure that nodes that are not responding due to
>> %% hardware errors (Or being suspended by means of ^Z) will
>> %% be considered to be down. If we do not want to have this
>> %% we must start the net_kernel (in erlang) without its
>> %% ticker process, In that case this code will never run
>>
>>
>> ...and thought: promising -- is it then possible to experiment
>> with other tick algorithms?
>>
>> However, looking at net_kernel.erl:
>>
>> init({Name, LongOrShortNames, TickT}) ->
>>    process_flag(trap_exit,true),
>>    case init_node(Name, LongOrShortNames) of
>>        {ok, Node, Listeners} ->
>>            process_flag(priority, max),
>>            Ticktime = to_integer(TickT),
>>            Ticker = spawn_link(net_kernel, ticker, [self(), Ticktime]),
>>
>> In other words, you can't set net_ticktime to anything other than an
>> integer (and it has to be a smallint, since it's used in a
>> receive ... after expression).
>>
>> (To do justice to the comment above, couldn't a net_ticktime
>> of, say, 0 turn off net ticking altogether?)
>>
>> What one can do, then, is to set net_ticktime to a very large number
>> and run a user-level heartbeat.  If netsplits are still experienced
>> without any visible problems in the user-level monitoring, or perhaps
>> even while traffic is still being serviced, then something is
>> definitely wrong with the tick algorithm. :)
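
A user-level heartbeat of that kind can be as simple as a process that
pings the peers on its own schedule and logs what it sees.  A sketch
only -- the module name, interval and logging are placeholders, and the
node list is assumed to be known up front:

-module(user_heartbeat).
-export([start/2]).

%% Ping every node in Nodes each IntervalMs milliseconds and log the
%% result, so user-level reachability can be compared with what
%% net_kernel decides.
start(Nodes, IntervalMs) ->
    spawn(fun() -> loop(Nodes, IntervalMs) end).

loop(Nodes, IntervalMs) ->
    [error_logger:info_msg("heartbeat: ~p is ~p~n",
                           [Node, net_adm:ping(Node)])    %% pong | pang
     || Node <- Nodes],
    receive
    after IntervalMs ->
            loop(Nodes, IntervalMs)
    end.

Note that net_adm:ping/1 runs over the same distribution transport, so
if the point is to measure the network independently of it, a separate
UDP or TCP heartbeat like the sketches above would be the thing to use.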
>>
>> BR,
>> Ulf W
>> -- 
>> Ulf Wiger
>> CTO, Erlang Training & Consulting Ltd
>> http://www.erlang-consulting.com
>>
>> ________________________________________________________________
>> erlang-questions mailing list. See http://www.erlang.org/faq.html
>> erlang-questions (at) erlang.org
>>
>
>
>
> -- 
> Jayson Vantuyl
> kagato@REDACTED
>



-- 
Jayson Vantuyl
kagato@REDACTED






