Performance of selective receive

todd todd@REDACTED
Sun Nov 13 19:04:41 CET 2005


Wow, that's really interesting, thanks for the writeup. In non-Erlang 
systems where we've had similar issues, the "best" solution we found was 
end-to-end flow control: keep messages on the sender side until the 
system can handle the load. CPU load was one of the criteria for 
deciding whether work could be accepted, along with memory and several 
others. Most other solutions simply mask the problem until some other 
horrible scenario comes up. By keeping work on the sender side you may 
also be able to take advantage of certain optimizations, such as 
combining or canceling work. You also minimize drops and drop handling.

Having unconstrained senders turns out to be a gun that can fire at any time.
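
For what it's worth, here is a minimal sketch of that idea in Erlang
(the module, the message names and the credit window are made up, and
real code would need timeouts and monitors): the sender holds a fixed
number of credits and blocks when they run out, so the receiver's
message queue can never grow beyond the window.

    %% Hypothetical credit-based flow control (sketch only).
    -module(flowctl).
    -export([send/3, worker/1]).

    %% Send Msg to Server; block for an ack when no credits remain.
    %% Returns the updated credit count for the caller to thread along.
    send(Server, Msg, 0) ->
        receive {Server, ack} -> send(Server, Msg, 1) end;
    send(Server, Msg, Credits) ->
        Server ! {self(), Msg},
        Credits - 1.

    %% The receiver acks only after the work is done, so back-pressure
    %% reflects actual processing capacity, not just queueing.
    worker(DoWork) ->
        receive
            {From, Msg} ->
                DoWork(Msg),
                From ! {self(), ack},
                worker(DoWork)
        end.

A client that starts with, say, 10 credits can never put more than 10
messages into the worker's queue, no matter how slow the worker gets.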

Pascal Brisset wrote:

>Erlang's selective receive semantics definitely have advantages
>over the FIFO communication found in other models.
>But they also come with a performance penalty, one which has bitten
>us once in a production system.  Ulf Wiger's presentation at EUC
>last week reminded me of it.  Consider the following scenario:
>
>
>A server process is receiving requests at a constant rate.  For
>each request, the server calls a backend service synchronously:
>
>    server(State) ->
>        receive Msg ->
>            NewState = gen_server:call(backend, {State,Msg}),
>            server(NewState)
>        end.
>
>The system is dimensioned so that the CPU load is low (say 10 %).
>Now at some point in time, the backend service takes one second
>longer than usual to process one particular request.
>You'd expect that some requests will be delayed (by no more than
>one second) and that quality of service will return to normal
>within two seconds, since there is so much spare capacity.
>
>Instead, the following can happen: During the one second outage,
>requests accumulate in the message queue of the server process.
>Subsequent gen_server calls take more CPU time than usual because
>they have to scan the whole message queue to extract replies.
>As a result, more messages accumulate, and so on.
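
The scan comes from the selective receive hiding inside
gen_server:call.  Roughly, each call does something like this
(simplified from gen.erl; details vary between OTP releases):

    Ref = erlang:monitor(process, Server),
    Server ! {'$gen_call', {self(), Ref}, Request},
    receive
        {Ref, Reply} ->                    % matches only this call's reply
            erlang:demonitor(Ref),
            Reply;
        {'DOWN', Ref, _, _, Reason} ->     % server died before replying
            exit(Reason)
    end

None of the queued client messages match {Ref, Reply}, so every call
must inspect and skip the entire backlog first: each call costs time
proportional to the current queue length, which is what makes the
snowball self-reinforcing.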
>
>snowball.erl (attached) simulates all this.  It slowly increases
>the CPU load to 10 %.  Then it pauses the backend for one second,
>and you can see the load rise to 100 % and remain there, although
>the throughput has fallen dramatically.
>
>
>Here are several ways to avoid this scenario:
>
>- When measuring the actual capacity of a system, always simulate
>  all sorts of random delays and jitter everywhere.  But in the
>  example above, this would mean taking a huge safety margin.
>
>- Explicitly detect the overload and reject requests when it occurs.
>  This is the standard way of handling overload.  But it's hard to
>  justify losing calls on a system that is running at 10 % CPU load.
>  Plus, the overload develops so quickly that you need to flush the
>  whole message queue in order to return to normal reasonably fast
>  (see the sketch after this list).
>
>- Add a proxy process dedicated to buffering requests from clients
>  and making sure the message queue of the server remains small.
>  This was suggested to me at the erlounge.  It is probably the
>  best solution, but it complicates process naming and supervision.
>  And programmers just shouldn't have to wonder whether each server
>  needs a proxy or not.  (A sketch follows after the quoted message.)
>
>- Modify Erlang to support multiple message queues per process.
>  Essentially, this is what the buffering proxy above emulates.
>
>- Optimize the implementation of selective receive in the emulator.
>  E.g. each message queue could have some kind of cache of the latest
>  X receive patterns which have failed against the first Y messages
>  of the queue.  This is hard to implement in a useful way because
>  of patterns with bound variables (see gen:wait_resp_mon/3).
>
>- Modify Erlang to add a "blocking send" primitive, or emulate one
>  by polling process_info(Server, message_queue_len) (through
>  rpc:call/4 for remote processes), so that flow control propagates
>  end-to-end (ultimately, to some socket or file I/O).  But bounded
>  queues may introduce deadlocks.
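
On the overload-rejection option: flushing the backlog is cheap as
long as the receive uses a catch-all pattern, since no scanning is
involved.  A minimal sketch, with made-up names and threshold:

    %% Hypothetical overload guard: once the queue grows past Limit,
    %% drain it completely.  Returns the number of requests dropped.
    maybe_shed(Limit) ->
        {message_queue_len, N} = process_info(self(), message_queue_len),
        if N > Limit -> flush(0);
           true      -> 0
        end.

    %% Catch-all receive with a zero timeout: pure draining, no scan.
    flush(Dropped) ->
        receive _Any -> flush(Dropped + 1)
        after 0 -> Dropped
        end.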
>
>Any other ideas?
>
>-- Pascal
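
For completeness, a minimal sketch of the buffering proxy described
above (the protocol is made up: the server sends {ready, self()} to
the proxy after finishing each request, and the proxy is started with
proxy(Server, queue:new(), idle)).  The backlog lives in an explicit
queue structure and both receives end in a catch-all pattern, so
nothing ever scans a long message queue:

    %% Hypothetical buffering proxy: clients send to the proxy, and the
    %% real server holds at most one unprocessed request at a time.
    proxy(Server, Q, idle) ->
        case queue:out(Q) of
            {{value, Msg}, Q1} ->          % backlog waiting: forward one
                Server ! Msg,
                proxy(Server, Q1, busy);
            {empty, Q1} ->                 % nothing buffered: wait
                receive
                    {ready, Server} -> proxy(Server, Q1, idle);
                    Msg             -> Server ! Msg,
                                       proxy(Server, Q1, busy)
                end
        end;
    proxy(Server, Q, busy) ->              % server working: just buffer
        receive
            {ready, Server} -> proxy(Server, Q, idle);
            Msg             -> proxy(Server, queue:in(Msg, Q), busy)
        end.

The server loop then becomes:

    server(Proxy, State) ->
        receive Msg ->
            NewState = gen_server:call(backend, {State, Msg}),
            Proxy ! {ready, self()},
            server(Proxy, NewState)
        end.

Its gen_server:call still performs a selective receive, but over a
message queue that now stays short by construction.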
>
>------------------------------------------------------------------------
>
>%% Demonstration of the snowball effect triggered by a transient
>%% overload and selective receive on a long message queue.
>%% Assumes /proc/stat in linux-2.6.3 format.
>
>%% Increase InitialPeriod if your system is too slow.
>
>-module(snowball).
>-compile(export_all).
>
>%% Start the backend, server, monitor and client; ramp up the load,
>%% then pause the backend for one second and watch the fallout.
>run() ->
>    {ok,_} = gen_server:start_link({local,backend}, ?MODULE, 0, []),
>    Server = spawn_link(?MODULE, server, [undefined]),
>    spawn_link(?MODULE, monitor, [Server, get_ticks(), 0]),
>    InitialPeriod = 500,
>    Client = spawn_link(?MODULE, client, [Server, InitialPeriod]),
>    receive after 2000 -> ok end,
>    incr_load(Server, Client, get_ticks()),
>    receive after 10000 -> ok end,
>    gen_server:call(backend, pause),
>    receive after 60000 -> ok end,
>    erlang:halt().
>
>%% Every 10 s, tell the client to speed up, until the average CPU
>%% load reaches 10 %.
>incr_load(Server, Client, {PrevBusy,PrevTotal}) ->
>    receive after 10000 -> ok end,
>    {Busy,Total} = Ticks = get_ticks(),
>    Load = (Busy-PrevBusy) * 100 div (Total-PrevTotal),
>    io:format("Average load was ~w %.~n", [Load]),
>    if Load < 10 -> Client ! speedup, incr_load(Server, Client, Ticks);
>       true -> io:format("~nModerate load reached.~n~n")
>    end.
>
>%% Once per second, print the server's queue length, the CPU load
>%% and the backend throughput.
>monitor(Server, {PrevBusy,PrevTotal}, PrevCount) ->
>    receive after 1000 -> ok end,
>    {_, MQL} = process_info(Server, message_queue_len),
>    {Busy,Total} = Ticks = get_ticks(),
>    Count = gen_server:call(backend, read),
>    Load = (Busy-PrevBusy) * 100 div (Total-PrevTotal),
>    io:format("queue_length=~w  load=~w %  rate=~w/s~n",
>	      [MQL, Load, Count-PrevCount]),
>    monitor(Server, Ticks, Count).
>
>%% Return {BusyTicks, TotalTicks} parsed from the first line of
>%% /proc/stat (user+nice+system vs. user+nice+system+idle).
>get_ticks() ->
>    {ok, F} = file:open("/proc/stat", [read]),
>    case io:fread(F, '', "~*s ~d ~d ~d ~d") of
>	{ok, [T1,T2,T3,T4]} -> file:close(F), {T1+T2+T3, T1+T2+T3+T4};
>	_ -> exit(unsupported_proc_stat_format)
>    end.
>
>%% Current wall-clock time in microseconds.
>utime() ->
>    {MS,S,US} = now(),
>    (MS*1000000+S)*1000000 + US.
>
>%%%% Client.
>
>%% Send one message to Server every Period microseconds; each
>%% 'speedup' message cuts the period to 80 % of its current value.
>client(Server, Period) ->
>    io:format("~nClient sending ~w messages/s.~n~n", [1000000 div Period]),
>    client(Server, Period, utime()).
>
>client(Server, Period, Next) ->
>    Wait = lists:max([0, Next-utime()]) div 1000,
>    receive speedup -> client(Server, Period*4 div 5)
>    after Wait -> Server ! "hello", client(Server, Period, Next+Period)
>    end.
>
>%%%% Server.
>
>server(State) ->
>    receive Msg ->
>	    NewState = gen_server:call(backend, {State,Msg}),
>	    server(NewState)
>    end.
>
>%%%% Backend.
>
>init(State) ->
>    {ok, State}.
>
>handle_call(pause, _From, State) ->
>    io:format("~nBackend pausing for 1 second.~n~n"),
>    receive after 1000 -> ok end,
>    {reply, ok, State};
>handle_call(read, _From, State) ->
>    {reply, State, State};
>handle_call({S,_Msg}, _From, State) ->
>    {reply, S, State+1}.
>
>%%%%