Performance of selective receive

Pascal Brisset pascal.brisset@REDACTED
Mon Nov 14 02:33:59 CET 2005

Sean Hinde writes:
 > Yes, of course. You should use {active, once}. I will update the  
 > tutorial one day.

OK, good point.

 > No. This is only true if the server actually blocks. By the use of  
 > gen_server:reply/2 you can make every caller believe that it has made  
 > exclusive use of the server, but in fact the server can simply pass  
 > the messages straight through (maybe updating some local internal  
 > state on the way through).

Agreed.  But this boils down to passing requests to the backend
as fast as possible, hot-potato style, and hoping that it can
cope with lots of pending requests itself.
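
For illustration, here is a minimal sketch of the pass-through pattern under discussion. It is not taken from the tutorial or from the attached demo. The module name, the message shapes ({request, ReplyTo, Tag, Req} and {response, Tag, Result}) and the backend process are all assumptions made for the example. Only gen_server:call/2, the {noreply, State} return and gen_server:reply/2 are standard OTP.

```erlang
-module(passthrough).
-behaviour(gen_server).
-export([start_link/1, request/2]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

%% Backend is the pid of a hypothetical backend process that answers
%% {request, ReplyTo, Tag, Req} with {response, Tag, Result}.
start_link(Backend) ->
    gen_server:start_link(?MODULE, Backend, []).

%% Looks like an ordinary blocking call to the client.
request(Server, Req) ->
    gen_server:call(Server, {request, Req}).

init(Backend) ->
    {ok, #{backend => Backend, pending => #{}}}.

%% Forward the request at once and return without replying, so the
%% server is immediately free to take the next message.
handle_call({request, Req}, From, State = #{backend := B, pending := P}) ->
    Tag = make_ref(),
    B ! {request, self(), Tag, Req},
    {noreply, State#{pending := P#{Tag => From}}}.

handle_cast(_Msg, State) ->
    {noreply, State}.

%% When the backend answers, look up the waiting caller and reply to it.
handle_info({response, Tag, Result}, State = #{pending := P}) ->
    case maps:take(Tag, P) of
        {From, P1} ->
            gen_server:reply(From, Result),
            {noreply, State#{pending := P1}};
        error ->
            {noreply, State}
    end;
handle_info(_Other, State) ->
    {noreply, State}.
```

The server never blocks here, but it does not limit anything either: if the backend stalls, the pending map and the backend's own message queue simply keep growing.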

 > Yes, that's fine, that is most likely required in such scenarios.  
 > This is a different case to waiting for many seconds for some other  
 > system to respond. The erlang scheduler is fair given half a chance.  
 > In this case that means replacing the cast to your internal server  
 > with a call.

So here is another demo with synchronous calls instead of casts:
1000 servers, each sending 10 synchronous requests to the server
every second.  On my PC it starts at 10 % CPU load and 10000 msg/s.
Then the backend is paused for one second.  Afterward, the program
stabilizes at 100 % CPU and 600 msg/s, with a huge message queue.
Maybe that's a problem with my system.  Someone please confirm.

 > It is pretty unlikely that all 1000 servers sent their  
 > requests at exactly the same moment, and even if they did, the system  
 > would recover quickly, not spiral into meltdown as it would in the  
 > async case.

I claim that if the server loop has several receive statements,
one of which is a selective receive, then as the message queue grows,
each loop iteration becomes more expensive, because the selective
receive has to scan past every queued message that does not match.
If this extra burden exceeds whatever spare CPU capacity the system
had initially, it may fail to recover.
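
A rough way to observe this effect, as a sketch only and not the attached demo: fill a process's mailbox with N messages that the receive does not match, then time one selective receive for a message that sits behind them. The module name selrecv_cost, the function measure/1 and the message shapes are made up for the example. Timings will vary by machine and Erlang version, so no figures are given.

```erlang
-module(selrecv_cost).
-export([measure/1]).

%% Returns {Micros, Left}: the time taken by one selective receive
%% with N non-matching messages ahead of the wanted one, and the
%% number of messages still queued afterwards.
measure(N) ->
    Parent = self(),
    Ref = make_ref(),
    spawn_link(fun() ->
        %% Queue N messages that the receive below will skip over.
        [self() ! {noise, I} || I <- lists:seq(1, N)],
        %% The only message the selective receive matches is last.
        self() ! wanted,
        T0 = erlang:monotonic_time(microsecond),
        receive wanted -> ok end,
        T1 = erlang:monotonic_time(microsecond),
        {message_queue_len, Left} =
            erlang:process_info(self(), message_queue_len),
        Parent ! {Ref, T1 - T0, Left}
    end),
    receive
        {Ref, Micros, Left} -> {Micros, Left}
    end.
```

Comparing, say, selrecv_cost:measure(1000) with selrecv_cost:measure(1000000) should show the receive time growing with the queue length. A server that pays this cost on every request while its queue is long has less capacity left to drain that queue.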

 > No. If that happens the problem is that the Erlang program is not  
 > correctly designed.

I agree.  I am only saying that this design error is not trivial.
Once you are aware of it, a lot of strange fluctuations in CPU load
begin to make sense, and you can fix things.  It would be even better
if we could optimize selective receive so as to remove the mere
possibility of running into trouble.

-- Pascal

Attachment: syncsnowball.erl (text scrubbed by the list archive)