[erlang-questions] gen_sctp: What delays SACK?

Oliver Korpilla Oliver.Korpilla@REDACTED
Wed Nov 14 12:25:56 CET 2018

Hello, Andreas.

We also see retransmissions, and they do have an impact on latency. (We also see retransmissions on another link that we're not testing with BEAM.)

But my biggest concern is this sudden stop of all sending from the C++ side, which I cannot wrap my head around.

Thank you very much,

Sent: Wednesday, 14 November 2018 at 11:20
From: "Andreas Schultz" <andreas.schultz@REDACTED>
To: "Oliver Korpilla" <Oliver.Korpilla@REDACTED>
Cc: "Jesper Louis Andersen" <jesper.louis.andersen@REDACTED>, "Erlang (E-mail)" <erlang-questions@REDACTED>
Subject: Re: [erlang-questions] gen_sctp: What delays SACK?

Hi Oliver,
One reason for your problems might be that Erlang's SCTP implementation is unbelievably slow.
Apparently SCTP SACKs are only sent after the application has actually received the payload (via the recv syscall).
So what I'm seeing is that if your application takes too long to read the payload, the SACK will not be sent and the sender will start retransmitting the SCTP packet.
I have written a simple socket benchmark client [1].
When testing with TCP, UDP and SCTP on OTP 21.1, sending 1000 packets with a length of 100 bytes, I get these results:

* server is socat to /dev/null (e.g. "socat -u SCTP-LISTEN:6000,reuseaddr,keepalive,rcvbuf=131071,reuseaddr /dev/null")

* times are in microseconds

$ ./bench.escript sctp 6000 100 1000
As you can see, TCP is fastest with ~8ms due to the Nagle algorithm combining the packets before sending, and UDP is still OK with ~17.5ms. SCTP takes an astonishing 817ms. That is roughly 46 times slower than UDP.
[1]: https://gist.github.com/RoadRunnr/53c19861aa4e4fa5bd45c072727ab971
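
To put the totals quoted above on a per-packet basis, here is a small sketch in Python (chosen only for illustration; the benchmark itself is the escript in [1]). The three totals are the approximate figures from this message, and the function names are hypothetical.

```python
# Approximate totals quoted in the message, in milliseconds,
# for 1000 packets of 100 bytes each.
totals_ms = {"tcp": 8.0, "udp": 17.5, "sctp": 817.0}
packets = 1000


def per_packet_us(total_ms, count=packets):
    """Average time per packet in microseconds."""
    return total_ms * 1000.0 / count


def slowdown(proto, baseline="udp"):
    """How many times slower `proto` is than `baseline`."""
    return totals_ms[proto] / totals_ms[baseline]


for proto, total in totals_ms.items():
    print(f"{proto}: {per_packet_us(total):.1f} us/packet, "
          f"{slowdown(proto):.1f}x UDP")
```

With these inputs SCTP comes out at a bit under 47 times the UDP total, in line with the "46 times" figure above.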
On Tue, 13 Nov 2018 at 18:01, Oliver Korpilla <Oliver.Korpilla@REDACTED> wrote:
Hello, Jesper.

The problem I see is that the C++ side just fails to send more messages back, but I'm stumped as to why.

It _looks_ like it fails to respond to my protocol requests for some reason.

But does it really? Or is something blocking/buffering/delaying/missing in the stack? And which side causes it?

I'm very very stumped. Because I've seen the tcpdump in Wireshark and C++ stops sending. It just stops. (If I had more trust in my SCTP knowledge I would _assume_ there's some sort of deadlock on the C++ side.)

Thank you very much,

Sent: Tuesday, 13 November 2018 at 15:51
From: "Jesper Louis Andersen" <jesper.louis.andersen@REDACTED>
To: "Oliver Korpilla" <Oliver.Korpilla@REDACTED>
Cc: "Erlang (E-mail)" <erlang-questions@REDACTED>
Subject: Re: [erlang-questions] gen_sctp: What delays SACK?

Use tcpdump(1) on the flow and look for who is adding the latency. The usual rule of protocol debugging is to start at the lowest level and verify each level as you go up, because then you have an audit trail of the events that happened, which can inform you at a higher level.
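
As a minimal sketch of what "look for who is adding the latency" could mean in practice: assuming the capture has already been reduced to (timestamp in milliseconds, sender) pairs, for example by exporting from Wireshark, the longest pause per sender can be found as below. The function name, the sender labels and the sample data are all made up for illustration and are not taken from the actual capture.

```python
def longest_silence(packets):
    """Find the longest pause between consecutive packets of each sender.

    packets: iterable of (timestamp_ms, sender) tuples in any order.
    Returns {sender: (gap_ms, gap_start_ms)}; senders with fewer than
    two packets do not appear in the result.
    """
    last_seen = {}   # sender -> timestamp of its previous packet
    worst = {}       # sender -> (longest gap so far, when that gap began)
    for ts, sender in sorted(packets):
        if sender in last_seen:
            gap = ts - last_seen[sender]
            if gap > worst.get(sender, (0, None))[0]:
                worst[sender] = (gap, last_seen[sender])
        last_seen[sender] = ts
    return worst


# Invented sample: "cpp" goes quiet for two seconds after t = 300 ms.
sample = [(0, "beam"), (100, "cpp"), (200, "beam"),
          (300, "cpp"), (400, "beam"), (2300, "cpp")]
print(longest_silence(sample))
```

A long gap on one side that is not preceded by a gap on the other side points at where to look next; it does not by itself say whether the application or the stack is responsible.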

On Tue, Nov 13, 2018 at 10:15 AM Oliver Korpilla <Oliver.Korpilla@REDACTED> wrote:

Hello.

We're using an Elixir application as a sort of protocol tester. It communicates with the system-under-test over SCTP as a transport.

We're observing delay and unsent messages and due to the nature of the SCTP protocol we're not sure which side causes the issue.

The BEAM side has the NO_DELAY option set and pumps out a burst of messages, but then waits for responses (so it will not burst indefinitely; it bursts once and then responds).

The C++ application has the DELAYED_SACK option set - we tried with both sack_freq 1 (which supposedly disables the algorithm) and higher (the default in our system).
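
For readers less familiar with delayed SACK, the following is a deliberately simplified model of the receiver side, not the kernel's implementation. It assumes the behaviour recommended in RFC 4960: acknowledge at least every second packet, and no later than 200 ms after an unacknowledged packet arrives. The parameter names sack_freq and sack_delay_ms mirror the socket option fields; the function itself and the arrival times are hypothetical.

```python
def sack_times(arrivals_ms, sack_freq=2, sack_delay_ms=200.0):
    """Return the times at which a simplified receiver emits a SACK.

    A SACK is sent as soon as `sack_freq` packets are pending, or when
    the delay timer started by the oldest pending packet expires.
    """
    sacks = []
    pending = 0       # packets received since the last SACK
    deadline = None   # expiry time of the running delay timer, if any
    for t in sorted(arrivals_ms):
        # The delay timer may fire before this packet arrives.
        if deadline is not None and deadline <= t:
            sacks.append(deadline)
            pending = 0
            deadline = None
        pending += 1
        if deadline is None:
            deadline = t + sack_delay_ms
        if pending >= sack_freq:
            sacks.append(t)
            pending = 0
            deadline = None
    # A last pending packet is acknowledged when its timer expires.
    if deadline is not None:
        sacks.append(deadline)
    return sacks


burst = [0, 1, 2]                       # three packets, 1 ms apart
print(sack_times(burst, sack_freq=1))   # every packet acknowledged at once
print(sack_times(burst, sack_freq=2))   # third packet waits for the timer
```

In this model an odd-sized burst followed by silence leaves the last packet unacknowledged until the timer fires, which is the kind of latency that sack_freq 1 is meant to avoid.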

(We also increased the receive window on both sides to ensure that senders would not block.)

But we're stumped. The C++ side stops responding at some point. When we ran an actual target test once and analyzed the tcpdump of the interfaces, we saw the SCTP messages from the system-under-test just stop: the C++ application emitted nothing on the wire and, accordingly, nothing was received.

Our latest line of inquiry is to find out whether the Elixir part is simply not getting scheduled - but can this affect, for example, SACK latency? Who acknowledges a message - the SCTP stack by itself or the application? And will the protocol block the sender until the SACK arrives?

I'm sorry for asking such vague questions but SCTP know-how is spread thin in our outfit and we're not the experts...

Thank you,
erlang-questions mailing list

Dipl.-Inform. Andreas Schultz

----------------------- enabling your networks ----------------------
Travelping GmbH                     Phone:  +49-391-81 90 99 0
Roentgenstr. 13                     Fax:    +49-391-81 90 99 299
39108 Magdeburg                     Email:  info@REDACTED
GERMANY                             Web:    http://www.travelping.com
Company Registration: Amtsgericht Stendal        Reg No.:   HRB 10578
Managing Director: Holger Winkelmann             VAT ID No.: DE236673780
