20k messages in 4s but want to go faster!

Jim Morris <>
Mon Jul 13 21:14:19 CEST 2009


On Jul 13, 1:59 am, Joel Reymont <> wrote:
> > I think it is actually faster.
>
> Have you measured it?
>

No, as my use case wouldn't work using that technique (I need to check
the state of each client before sending the TCP packet).

> > when the 8,000 or so gen_fsm processes get that message they do a  
> > tcp send to their client. That is getting the fastest throughput for me.
>
> How do you know it's getting the fastest throughput for you?

Well the fact that my server seems to be much faster than your
server ;)

>
> I won't believe doing an extra round trip through a gen_* server
> is faster than pushing a static chunk of binary data to a list of  
> sockets.

Actually I go through one gen_server and one gen_fsm for each message.

>
> Please prove me wrong!

I have a theory as to why it is so much faster; however, proof would
require writing a lot of extra code :(

Basically the theory goes like this...
Using the list of sockets, you have to wait for every gen_tcp:send to
complete before sending the next one, i.e. every send is serialized.
Using a process per socket, all the gen_tcp:send() calls are done in
parallel (well, kind of, depending on how many schedulers/threads you
have), but I suspect gen_tcp:send yields once the data is passed to the
OS, so at a minimum the sends are interleaved.
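The contrast can be sketched like this (illustrative only; the function names `broadcast_serial` and `broadcast_parallel` are made up, not from the post):

```erlang
%% Serialized: each gen_tcp:send/2 must return before the next starts,
%% so one slow client delays everyone behind it in the list.
broadcast_serial(Sockets, Packet) ->
    lists:foreach(fun(S) -> gen_tcp:send(S, Packet) end, Sockets).

%% Process-per-socket: the broadcaster only delivers Erlang messages,
%% which is cheap; each socket-owner process then does its own
%% gen_tcp:send/2, so the actual sends can interleave.
broadcast_parallel(Pids, Packet) ->
    lists:foreach(fun(P) -> P ! {send, Packet} end, Pids).
```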

Sending 20,000 (or whatever) messages to 20,000 processes is extremely
fast, as Erlang is optimized for that use case, so the iteration that
sends a message to each process will run a lot faster than an
iteration that calls gen_tcp:send(). Now, I use a cast when sending
these messages; were you using cast or call?
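The cast/call distinction matters here: a cast-style send_event returns immediately, while a call-style sync_send_event blocks the broadcaster until each FSM replies. A minimal per-client gen_fsm along these lines (hypothetical module and names, assuming the old pre-gen_statem gen_fsm API from 2009-era OTP) might look like:

```erlang
-module(pusher).
-behaviour(gen_fsm).
-export([start_link/1, push/2]).
-export([init/1, ready/2, handle_event/3, handle_sync_event/4,
         handle_info/3, terminate/3, code_change/4]).

%% Hypothetical per-client FSM. push/2 uses gen_fsm:send_event/2 (a
%% cast): the broadcaster never blocks waiting for a reply. The
%% synchronous alternative would be gen_fsm:sync_send_event/2.
start_link(Socket) -> gen_fsm:start_link(?MODULE, Socket, []).

push(Pid, Packet) -> gen_fsm:send_event(Pid, {push, Packet}).

init(Socket) -> {ok, ready, Socket}.

%% Async event: write to our own socket, stay in the `ready` state.
ready({push, Packet}, Socket) ->
    gen_tcp:send(Socket, Packet),
    {next_state, ready, Socket}.

handle_event(_Event, StateName, Data) -> {next_state, StateName, Data}.
handle_sync_event(_Event, _From, StateName, Data) ->
    {reply, ok, StateName, Data}.
handle_info(_Info, StateName, Data) -> {next_state, StateName, Data}.
terminate(_Reason, _StateName, _Data) -> ok.
code_change(_OldVsn, StateName, Data, _Extra) -> {ok, StateName, Data}.
```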

I effectively do this to get the packet to each process...

[ P ! Packet || P <- Pids ]

where Packet is the binary to send and Pids is a list of socket
processes.
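On the receiving end, each socket-owner process just writes whatever binary it gets to its own socket. The post actually uses gen_fsm processes, but a bare-process equivalent (a hypothetical sketch, not the author's code) is:

```erlang
%% Hypothetical receive loop for a socket-owner process: the raw
%% message delivered by `P ! Packet` is written to this process's
%% socket, and the loop recurses to wait for the next packet.
socket_loop(Socket) ->
    receive
        Packet when is_binary(Packet) ->
            gen_tcp:send(Socket, Packet),
            socket_loop(Socket);
        stop ->
            gen_tcp:close(Socket)
    end.
```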

More information about the erlang-questions mailing list