<div dir="ltr">Maybe this problem is related to how the application works.<div><br><div>Each node may host a shard of the service.</div><div>The application contains a process (the shard manager) that registers in the global registry under a key like {shard, 2}.</div>


<div>Each shard contains thousands (millions planned) of processes, registered locally in gproc as {n, l, {worker, <<"worker_name">>}}.</div><div>Access to a local worker (say, worker:access/2) is a pair of calls: gproc:lookup_pid followed by gen_server:call.</div>
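<div><br></div><div>A minimal sketch of that local access path. The argument names (Name, Request) are assumptions for illustration only; the real worker:access/2 may take different arguments.</div>
<pre>
-module(worker).
-export([access/2]).

%% Hypothetical signature: Name is the worker name used in the gproc key,
%% Request is whatever term the worker's gen_server understands.
access(Name, Request) ->
    %% Local gproc lookup of the registered name {n, l, {worker, Name}}.
    %% gproc:lookup_pid/1 raises badarg if the name is not registered.
    Pid = gproc:lookup_pid({n, l, {worker, Name}}),
    %% Synchronous call to the local process; no distribution traffic here.
    gen_server:call(Pid, Request).
</pre>
<div>Both steps stay on the local node, so by themselves they should not generate distribution packets.</div>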


<div>Access to any worker consists of determining the shard number (crc32(Name) rem Count), then N = node(global:whereis_name({shard, 2})), then rpc:call(N, worker, access, [...]).</div><div><br></div><div>So there are two operations related to distribution: global:whereis_name and rpc:call.</div>
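<div><br></div><div>The routing steps above can be sketched roughly as follows. The module name shard_router, the function names, and the argument list passed to worker:access are hypothetical and only illustrate the flow; the handling of an unregistered shard is also an assumption.</div>
<pre>
-module(shard_router).
-export([shard_of/2, access/3]).

%% Shard number for a worker name: crc32(Name) rem Count.
shard_of(Name, Count) when is_binary(Name), is_integer(Count), Count > 0 ->
    erlang:crc32(Name) rem Count.

%% Route a request to the node that hosts the worker's shard.
%% Request is a hypothetical second argument of worker:access/2.
access(Name, Count, Request) ->
    Shard = shard_of(Name, Count),
    %% Find the shard manager in the global registry.
    case global:whereis_name({shard, Shard}) of
        undefined ->
            {error, {no_shard, Shard}};
        Pid ->
            N = node(Pid),
            %% Distributed step: run worker:access/2 on the owning node.
            rpc:call(N, worker, access, [Name, Request])
    end.
</pre>
<div>In this sketch rpc:call is the only step that has to cross the distribution socket on every request, while global:whereis_name is answered from the local copy of the global name table.</div>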


<div><br></div><div>Does either of them force a TCP push? If so, how can I change this behavior at the cost of a couple of milliseconds of latency?</div></div></div><div class="gmail_extra"><br><br><div class="gmail_quote">

On Tue, Jul 8, 2014 at 11:20 PM, Danil Zagoskin <span dir="ltr"><<a href="mailto:z@gosk.in" target="_blank">z@gosk.in</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Hi!<div><br></div><div>There is an enormous packets-per-second rate in my distributed Erlang setup.</div><div><br>


</div><div>Under heavy load, on a single distribution socket and in one direction, tcpdump shows rates of 40..100 packets per millisecond with the TCP push flag set.</div>

<div>Most packets are 47..112 bytes in size, while the MTU on the network interface is 8950 (jumbo frames).</div><div><br></div><div>If the distribution driver aggregated messages to fill the MTU, there would be about 100 times fewer packets on the network.</div>


<div>With a cluster of 3 nodes (very small) and both directions, we get about 300K extra PPS, which causes packet drops and TCP retransmits (and thus increases latency a lot).</div><div><br></div><div>Is it possible to make Erlang distribution push packets less often (once per millisecond would be enough for me)?</div>


<div><br></div><div>I'm sure this is not caused by net ticks, because tracing the dist_util:con_loop process shows a quite low {_, tick} message rate.</div><div><br></div><div>The Erlang/OTP version used is 17.0.<span class="HOEnZb"><font color="#888888"><br clear="all">


<div>

<br></div>-- <br><div dir="ltr"><div><font face="'courier new', monospace">Danil Zagoskin | <a href="mailto:z@gosk.in" target="_blank">z@gosk.in</a></font></div></div></font></span></div></div>

</blockquote></div><br><br clear="all"><div><br></div>-- <br><div dir="ltr"><div><font face="'courier new', monospace">Danil Zagoskin | <a href="mailto:z@gosk.in" target="_blank">z@gosk.in</a></font></div></div>

</div>