Performance of selective receive

Sun Nov 13 23:23:42 CET 2005

On 13 Nov 2005, at 18:18, Pascal Brisset wrote:

> Sean Hinde writes:
>> Alternative proposal:
>>
>> Never, but never use async messages into a blocking server from the
>> outside world. The way to avoid this is to make the socket process
>> (or whatever it is) make a synchronous call into the gen_server
>> process.
>
> Agreed, it's all about propagating flow-control end-to-end.
>
> Note that if the "socket process" uses {active,true}, it might
> itself be susceptible to the "snowball effect" - unless we are
> talking about a well-behaved network protocol with end-to-end
> flow control, in which case there is no need to worry at all.

Yes, of course. You should use {active, once}. I will update the  
tutorial one day.
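
A minimal sketch of such a socket process, assuming a gen_server that
answers {request, Data} calls with iodata to send back. The module
name socket_proc and the request tuple are made up for illustration;
only inet:setopts/2, gen_tcp:send/2 and gen_server:call/2 are real
OTP calls.

```erlang
-module(socket_proc).
-export([loop/2]).

%% Hypothetical socket-owning process. Server is the pid (or registered
%% name) of a gen_server assumed to handle {request, Data} calls.
loop(Socket, Server) ->
    %% Ask for exactly one packet; the socket then stays passive until
    %% we re-arm it, so the TCP window pushes back on the sender.
    case inet:setopts(Socket, [{active, once}]) of
        ok ->
            receive
                {tcp, Socket, Data} ->
                    %% Synchronous call: the socket is not re-armed
                    %% until the server has replied.
                    Reply = gen_server:call(Server, {request, Data}),
                    case gen_tcp:send(Socket, Reply) of
                        ok -> loop(Socket, Server);
                        {error, _Reason} -> ok
                    end;
                {tcp_closed, Socket} ->
                    ok;
                {tcp_error, Socket, _Reason} ->
                    ok
            end;
        {error, _Reason} ->
            ok
    end.
```

The process calling loop/2 must be the controlling process of the
socket, otherwise the {tcp, ...} messages go elsewhere.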

>
> Also, consider a scenario with not one client (or socket process),
> but 1000.  Even if each client calls the server synchronously, the
> server can still have as many as 1000 requests in its message queue.
> That's enough to trigger a snowball effect.

No. This is only true if the server actually blocks. With
gen_server:reply/2, every caller still sees an ordinary synchronous
call, as if it had exclusive use of the server, while the server
itself returns noreply and passes each request straight through
(maybe updating some local internal state on the way).

If you use {active, once} in your sockets then each socket process
will hold back further traffic from the network until it gets a reply
from the server. It is pretty unlikely that all 1000 clients send
their requests at exactly the same moment, and even if they did, the
system would recover quickly, not spiral into meltdown as it would in
the async case.

The limit of the system then becomes the number of clients and the  
real processing required from each one, not some unfortunate message  
storm.
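
Roughly what I mean by a server that does not block, as a sketch
only. The module name passthrough_server, the request/2 wrapper and
slow_backend/1 (a 100 ms sleep standing in for the real backend) are
invented for the example; the gen_server callbacks and
gen_server:reply/2 are the standard OTP ones.

```erlang
-module(passthrough_server).
-behaviour(gen_server).
-export([start_link/0, request/2]).
-export([init/1, handle_call/3, handle_cast/2]).

start_link() ->
    gen_server:start_link(?MODULE, [], []).

%% To the caller this is an ordinary blocking call.
request(Server, Req) ->
    gen_server:call(Server, {request, Req}).

init([]) ->
    %% State is just a count of requests seen (illustrative only).
    {ok, 0}.

handle_call({request, Req}, From, Count) ->
    %% Hand the slow part to a worker, which answers the caller
    %% directly using the From tag.
    spawn_link(fun() ->
        Result = slow_backend(Req),
        gen_server:reply(From, Result)
    end),
    %% Returning noreply frees the server for the next message at once.
    {noreply, Count + 1}.

handle_cast(_Msg, State) ->
    {noreply, State}.

%% Hypothetical stand-in for a blocking backend call.
slow_backend(Req) ->
    timer:sleep(100),
    {ok, Req}.
```

Each caller waits about 100 ms for its own answer, but the server's
message queue drains immediately. A real system would want a bound on
the number of workers (a pool, or a reject on congestion) rather than
an unlimited spawn.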

>
>
>> You then have to make the gen_server itself not block. The
>> neatest way is to spawn a new process to make the blocking call to
>> the backend, and use the gen_server:reply/2 mechanism to reply later.
>> You could also use a pool of backend processes and reject if you hit
>> "congestion".
>
> Sometimes you just can't process requests asynchronously.
> In our case, the server was a session manager which *must*
> read and update a local mnesia table of active sessions
> before it can process the next message.

Yes, that's fine, that is most likely required in such scenarios.
This is a different case from waiting many seconds for some other
system to respond. The Erlang scheduler is fair given half a chance.
In this case that means replacing the cast to your internal server
with a call.

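Under normal traffic the sync call sends twice as many messages
(a request and a reply instead of a request alone). But you don't
need to worry about that, as you are not overloaded. When you are
overloaded, making this synchronous protects you from these message
storms, because each caller can have at most one request outstanding.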
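
The cast-versus-call change can be sketched like this. Everything
here is hypothetical: session_mgr, touch/2 and touch_async/2 are
invented names, and a map of counters stands in for the mnesia table
of sessions so that the example is self-contained.

```erlang
-module(session_mgr).
-behaviour(gen_server).
-export([start_link/0, touch_async/2, touch/2]).
-export([init/1, handle_call/3, handle_cast/2]).

start_link() ->
    gen_server:start_link(?MODULE, [], []).

%% Async version: returns at once, so nothing stops a busy caller
%% from piling messages into the server's queue.
touch_async(Server, Id) ->
    gen_server:cast(Server, {touch, Id}).

%% Sync version: the caller waits for the reply, so each caller has
%% at most one request in the queue at any time.
touch(Server, Id) ->
    gen_server:call(Server, {touch, Id}).

init([]) ->
    %% A map stands in for the session table (assumption).
    {ok, #{}}.

handle_call({touch, Id}, _From, Sessions) ->
    New = bump(Id, Sessions),
    {reply, maps:get(Id, New), New}.

handle_cast({touch, Id}, Sessions) ->
    {noreply, bump(Id, Sessions)}.

%% Count how many times a session has been touched.
bump(Id, Sessions) ->
    maps:update_with(Id, fun(N) -> N + 1 end, 1, Sessions).
```

The server does the same work in both cases. The only difference is
that touch/2 makes the caller wait, which is what ties the arrival
rate to the rate the server can actually sustain.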

I looked into a system once which suffered from this. Each client
made 5 async sends to a central logging server before yielding. Above
a certain load the system bogged down. By simply making these
requests synchronous the overall throughput of the system greatly
increased and it remained consistent under sustained load.

>
> As Ulf highlighted, there are blocking calls hidden everywhere.
> There's nothing wrong with that, as long as your system is
> dimensioned correctly.  The problem is that a 10 % CPU load can
> suddenly turn into 100 % if you add a few thousand messages in
> the wrong message queue.

No. If that happens the problem is that the Erlang program is not  
correctly designed. Telephone systems have to handle 95% of maximum  
supported load while being attacked with 100 times that load.

Erlang is designed for telephone systems by the biggest supplier of  
telephone systems in the world. It works.

Sean