[erlang-questions] TCP buffering

sean mcevoy sean.mcevoy@REDACTED
Fri Dec 5 23:47:19 CET 2014


Hi Tobias,
Thanks for the reply.
Our DB has a few small & fairly static config disc tables, but the main
application data is in ram_copies and doesn't get dumped to disk.
It's nice to investigate something else though; I think I'm getting tunnel
vision with these buffer sizes! It's driving me mad :-P
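
For reference, this is roughly how I've been double-checking the table types
from the DB node's shell (just a sketch, run on the node that owns the tables):

    %% List every mnesia table together with its storage type on this node.
    [{T, mnesia:table_info(T, storage_type)} || T <- mnesia:system_info(tables)].
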
//Sean.


On Fri, Dec 5, 2014 at 8:22 AM, Tobias Lindahl <tobias.lindahl@REDACTED>
wrote:

>
>
> 2014-12-04 22:31 GMT+01:00 sean mcevoy <sean.mcevoy@REDACTED>:
>
>> Hi List,
>>
>> I need some TCP help and advice on how to manage buffer sizes from the
>> gen_tcp api.
>>
>> We have a system made up of 4 basic node types, let's call them A, B, C &
>> DB (all running R15B), each of which can have multiple instances. We also
>> have a communications protocol that runs over tcp links between the
>> different node types and works fine on the connections between A & B and B
>> & C, but on the connections between B & DB we've been getting some strange
>> behaviour.
>>
>> DB is a node that basically just runs mnesia and is the data store for
>> the system, if that's relevant, and connections to it also work fine for a
>> few days after it restarts. But after a few days we seem to get "chokes" in
>> the TCP communications at very regular 7-minute intervals. The rest of the
>> VM stays working, but messages on the TCP link take up to 8 seconds to reach
>> their destination, causing timeouts in the higher-level protocol.
>> These "chokes" are regular across peak & quiet times and cause a similar
>> proportion of timeouts regardless of the traffic level. (Traffic comprises
>> of simple non-blocking requests and responses)
>>
>> I've been investigating and have become focused on the TCP buffer
>> sizing, though I've no concrete evidence that this is actually the problem,
>> and my TCP knowledge before this investigation was more or less restricted
>> to what's exposed through gen_tcp. So please advise if you think there may
>> be another source.
>>
>
> Since you are running mnesia on the node, I would look for correlation
> between mnesia table dumps and the chokes. It might not be the root cause
> of the problem, but it might be the trigger. Seven minutes sounds about right
> for mnesia dumps, and depending on the data in your disc_copies tables, the
> dump can cause pretty bad behavior on schedulers and I/O, affecting
> seemingly unrelated processes. I've seen nodes behaving very badly because
> of this.
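>
> If it helps, the dump thresholds are easy to read off at runtime; this is
> just a sketch of what I'd check first (both are documented mnesia
> system_info keys):
>
>     %% How often mnesia dumps the transaction log, time- and write-based.
>     mnesia:system_info(dump_log_time_threshold).
>     mnesia:system_info(dump_log_write_threshold).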
>
> If you find the correlation to mnesia dumps, you could try setting the
> scheduler wakeup threshold (+swt) to low or very_low and the scheduler
> forced wakeup interval (+sfwi) to some nice number (1000 ms has worked for
> me), to make sure you are not starving the processes receiving the tcp
> communication.
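>
> For reference, on the command line that would look something like this (the
> 1000 is just an example, and flag availability can depend on your OTP
> version):
>
>     erl +swt very_low +sfwi 1000 ...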
>
> I can't comment on the TCP buffers, but at least you have something else to
> look for as well.
>
>
>> What I've found is that on initial connection both sndbuf & recbuf are
>> set to 10MB, and after a few days when we see these problems TCP has
>> resized them down to 49KB. On the other links where there are no problems
>> the buffers still have their original sizes. But for some reason
>> inet:setopts won't resize these 49KB buffers on the live site the way it
>> will in my test environment.
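>>
>> For reference, this is roughly what I've been running against the live
>> socket (Sock here is just a placeholder for the problematic connection):
>>
>>     %% Read the current sizes, try to push them back up to 10MB, re-check.
>>     inet:getopts(Sock, [buffer, sndbuf, recbuf]),
>>     inet:setopts(Sock, [{sndbuf, 10 * 1024 * 1024},
>>                         {recbuf, 10 * 1024 * 1024}]),
>>     inet:getopts(Sock, [buffer, sndbuf, recbuf]).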
>>
>> And just now I've discovered the separate buffer parameter that I didn't
>> know about before. From the OTP docs this one should be at least as large as
>> the larger of sndbuf & recbuf, but on my problematic link I have these values:
>> [{buffer,1460},{sndbuf,49152},{recbuf,49640}].
>> In my "good" links this is set to 10MB, just like sndbuf & recbuf, even
>> though we didn't explicitly set it.
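>>
>> I guess I could also set buffer explicitly when the link is opened,
>> something like this (Host, Port and the other connect options are just
>> placeholders):
>>
>>     {ok, Sock} = gen_tcp:connect(Host, Port,
>>                                  [binary,
>>                                   {sndbuf, 10 * 1024 * 1024},
>>                                   {recbuf, 10 * 1024 * 1024},
>>                                   {buffer, 10 * 1024 * 1024}]).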
>>
>> So my questions are:
>> - What governs this TCP resizing? I know it's in the protocol, but what
>> traffic patterns might cause it?
>> - How can I resize my buffers once I'm in this state?
>> - Are the buffer sizes the likely cause of the "chokes" I'm observing?
>>
>> Thanks in advance!
>> //Sean.
>>
>>
>>
>