[erlang-questions] SCTP support (was EPIPE and TCP sockets)

Mon Oct 15 12:54:57 CEST 2007

Per Hedeland wrote:
> Bruce Fitzsimons <Bruce@REDACTED> wrote:
>   
>> http://en.wikipedia.org/wiki/Head-of-line_blocking.
>>     
>
> Hm, I can't quite see how that relates to TCP vs SCTP. Presumably you're
> not actually referring to the situation in a network switch (i.e. layer
> 2 device), since the upper-layer protocol won't make a difference there?
> I can see how this phenomenon could occur with "naive" multiplexing of
> multiple "channels" on a single TCP session, like, um, the Erlang
> distribution does (though I can't see it happening with Erlang
> distribution since the "channels", i.e. inter-process messages, won't be
> blocked at the receiving end) - is that the connection?
>
> Of course this can be handled by per-channel flow control (like e.g. SSH
> does) when using TCP, but I guess this user-level complexity can be
> avoided by using SCTP instead.
>   
I agree. I wasn't specifically saying that head-of-line blocking 
impacted XMPP, just that SCTP can use streams to avoid it rather than 
using yet another socket (a limited resource for a server) and managing 
them at the app level. I've seen enough buggy implementations of 
application layer flow control, keep-alives etc.
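
To make the streams point concrete, here is a toy model (not real TCP or SCTP; the function name, packet trace and timings are all made up for illustration) of when an application can read each packet if ordering is enforced across one shared byte stream versus per stream only:

```python
from collections import defaultdict

def delivery_times(packets, arrival, per_stream):
    """Return {seq: time at which the application can read that packet}.

    packets    -- list of (seq, stream_id) in send order
    arrival    -- {seq: time the packet (or its retransmission) arrives}
    per_stream -- False: one ordered stream for everything (TCP-like)
                  True:  ordering enforced within each stream (SCTP-like)
    """
    ready = {}
    latest = defaultdict(float)  # latest arrival seen so far per ordering domain
    for seq, stream in packets:
        domain = stream if per_stream else "all"
        # In-order delivery: a packet cannot be read before every earlier
        # packet in the same ordering domain has arrived.
        latest[domain] = max(latest[domain], arrival[seq])
        ready[seq] = latest[domain]
    return ready

# Hypothetical trace: two channels; packet 1 (channel "a") is lost and its
# retransmission only arrives at t=50. Everything else arrives promptly.
packets = [(0, "a"), (1, "a"), (2, "b"), (3, "b"), (4, "a")]
arrival = {0: 1.0, 1: 50.0, 2: 3.0, 3: 4.0, 4: 5.0}

single = delivery_times(packets, arrival, per_stream=False)
multi = delivery_times(packets, arrival, per_stream=True)

print("shared ordering:    ", single)
print("per-stream ordering:", multi)
```

With shared ordering, channel "b" is held up behind the lost packet from channel "a"; with per-stream ordering only channel "a" waits. That is the effect the application would otherwise need a second socket or its own multiplexing layer to get.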

The idea with SCTP is not that the application layer can't do these 
things using existing protocols, but that one decent implementation at a 
lower layer can make it easier on everyone. In practice, though, I have 
not yet seen a real-world use of SCTP streams (i.e. >1 stream per 
association) or multihoming, as the applications that I've seen using it 
(SIGTRAN) have all this complexity in the upper layers at least once :-) 
I'm looking forward to playing with DIAMETER(/DCCA) to see if it is any 
different (SCTP support is mandated, I believe).
> The 2 hours would be for the (misnamed) keep-alive functionality in TCP
> - if there actually is output data waiting to be ack'ed, the "standard"
> timeout before the connection is declared dead is generally much
> shorter, on the order of 5 minutes or so - though of course that can
> still be "way too long" in some cases.
>
>   
Yes, you are right. The particular corner case that impacts XMPP with the 
keep-alives is that it uses the TCP connection status (server<->client) 
as an application layer indicator that the client is still there. So the 
last-advised state ("available to chat") of a dead client stays visible 
to all parties for quite some time if no one sends the dead one any 
messages. Sending application-layer keep-alives helps, but the last XMPP 
extension proposal I saw just did it one-way and relied on the TCP 
timeout. It works, but I'm used to sub-second notification that links are 
out. All the timers can be shortened, but they're still "way too long" 
for me.
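
As a rough sketch of what an application-layer keep-alive check looks like on the server side (this is not taken from any XMPP extension; the class name, the 0.5 second timeout and the injected clock are assumptions for illustration only):

```python
import time

class PresenceTracker:
    """Treats a peer as unavailable once no heartbeat has been seen
    for more than `timeout` seconds. Hypothetical helper, not an XMPP API."""

    def __init__(self, timeout, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock      # injectable so the logic can be tested
        self.last_seen = {}     # peer -> time of last heartbeat

    def heartbeat(self, peer):
        # Call this whenever any keep-alive (or other traffic) arrives.
        self.last_seen[peer] = self.clock()

    def is_available(self, peer):
        seen = self.last_seen.get(peer)
        return seen is not None and self.clock() - seen <= self.timeout

# Usage with a fake clock so the example is deterministic.
now = [0.0]
tracker = PresenceTracker(timeout=0.5, clock=lambda: now[0])
tracker.heartbeat("alice")

now[0] = 0.4
alive_before = tracker.is_available("alice")   # still within the timeout

now[0] = 1.0
alive_after = tracker.is_available("alice")    # heartbeat is now overdue
```

The point is only that the presence decision is driven by a timer the application controls, rather than by however long the TCP stack takes to give up on the connection.
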
>> and there is no way to tell what quantity of data was reliably delivered 
>> when the connection is finally terminated (Tony's problem).
>>     
>
> Yes, but that's the same story as we've frequently discussed here
> regarding whether Erlang distribution is "reliable" - if you *really*
> want *reliable*, knowing that the data was successfully received by the
> network stack at the destination is only the first chapter in that
> story...
>   
I agree. UDP can be just as "reliable" :-)

I think the discussion with Sean about putting Erlang distribution over 
SCTP is interesting; the multihoming might be useful for some 
applications where the node mesh needs to be maintained over a distance 
(or the Internet).

Cheers,
Bruce