20k messages in 4s but want to go faster!

Mon Jul 13 00:25:31 CEST 2009

I'd be careful about going on a wild goose chase (writing drivers
etc.). You need to get solid data about how fast you are actually
sending the messages right now. My gut feeling is that you are measuring
too much, not just the message transmission, i.e. out of the 4
seconds I'll bet that 80% is the time taken to tear down the
processes and send the DOWN message, and the rest is the actual
transmission latency (which is what you are really interested in).
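
One way to check that split is to time each phase on its own rather than the whole run. A minimal sketch, in Python purely for illustration (in Erlang, timer:tc around each phase would do the same job); the phase names and the placeholder callables are hypothetical stand-ins for the real test, not anything from the original benchmark:

```python
import time

def timed_phases(phases):
    """Run each (name, callable) phase in order and return {name: seconds}."""
    timings = {}
    for name, fn in phases:
        start = time.perf_counter()
        fn()
        timings[name] = time.perf_counter() - start
    return timings

def share(timings):
    """Return each phase's fraction of the total measured time."""
    total = sum(timings.values())
    return {name: (t / total if total else 0.0) for name, t in timings.items()}

# Hypothetical placeholders: swap in the real send and teardown steps.
timings = timed_phases([
    ("send", lambda: None),                # message transmission only
    ("teardown", lambda: time.sleep(0.01)),  # process exit + DOWN handling
])
fractions = share(timings)
```

If teardown dominates the fractions, the 4 seconds say little about transmission latency.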

I reran my TCP tests, all running on the same dual processor
Intel/Linux machine, with the tester running in one VM and the server
running in another VM. I can broadcast (technically unicast)
to 8,000 clients with a max latency of 277ms and an average of 192ms. This
scales almost linearly (I also tested 4,000, which took about half
the time), so by extrapolation 20,000 messages should take about 700ms
worst case and under 500ms on average. If the clients and the server
were distributed I bet it could be faster. I am measuring
the pure latency of the messages and no other overheads.
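
The extrapolation is plain proportional scaling from the 8,000-client reading. As a quick check of the arithmetic, assuming the scaling really stays linear up to 20,000 (only the 4,000 and 8,000 points were measured), a sketch in Python with a hypothetical helper name:

```python
def extrapolate_linear(measured_clients, measured_ms, target_clients):
    """Scale a latency reading proportionally to a different client count."""
    return measured_ms * target_clients / measured_clients

# Readings from the 8,000-client run, scaled to 20,000 clients.
worst_ms = extrapolate_linear(8000, 277, 20000)  # 692.5
avg_ms = extrapolate_linear(8000, 192, 20000)    # 480.0
```

That gives roughly 700ms worst case and just under 500ms on average, which is where the figures above come from.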

Before writing all sorts of optimizations you really need to be
confident of your test metrics. I'd test on a local machine first, get
a reading, then try on your EC2 instances and see if there are any
significant differences.

Just my 2 cents' worth ;)

On Jul 12, 12:49 pm, Joel Reymont <joe...@REDACTED> wrote:
> On Jul 12, 2009, at 7:38 PM, Rapsey wrote:
>
> > Well I have a similar problem with my streaming server. I suspect the main
> > issue with using gen_tcp is that every gen_tcp:send call will involve a
> > memory allocation and memcpy.
>
> Why do you think so? I thought binaries over 64 bytes are not copied.
>
> > My server needs to be able to fill up at least a gigabyte connection and
> > this involves a very large number of gen_tcp:send calls. The only way I
> > could achieve that number is by writing my own driver for data output.
>
> Did you write your own driver already?
>
> What kind of throughput were you getting with your old Erlang code?
>
> > This means bypassing gen_tcp completely and writing the socket handling
> > stuff by hand. Basically whenever I need to send the data, I loop through
> > the array of sockets (and other information) and send from one single buffer
> > (the sockets are non-blocking).
>
> I'll take a closer look at the TCP driver but other Erlang internals I  
> looked at use iovecs and scatter-gather IO (writev, etc.).
>
> ---
> Mac hacker with a performance bent http://www.linkedin.com/in/joelreymont
>
> ________________________________________________________________
> erlang-questions mailing list. See http://www.erlang.org/faq.html
> erlang-questions (at) erlang.org