distributed performance test

Thu Apr 21 22:56:23 CEST 2005

Gentlemen,

I'd like to ask for your advice on how to troubleshoot a severe
performance drop in gen_server:call() when the client and the server
run on different hosts.

The server implements the gen_server behavior, and the client code is
shown below.

I've set up two nodes on different servers.   I use a test function
drp_client:test_async(ConcurrentClients, TimesToRunPerClient) for stress 
testing.  The function uses spawn/3 to create concurrent processes, 
where each one calls the drp_client:do_request/2 shown below.
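
For reference, a minimal sketch of what such a test driver might look
like. This is not the actual drp_client:test_async/2: the module name
bench_sketch, the extra Fun argument, and the {done, Id, Rate} reply
messages are assumptions made for illustration only.

```erlang
-module(bench_sketch).
-export([test_async/3, run_client/4]).

%% Hypothetical driver: spawn Clients concurrent processes with spawn/3.
%% Each process calls Fun() Times times, prints its own statistics and
%% reports back to the caller.  Returns [{ClientId, CallsPerSecond}].
test_async(Clients, Times, Fun) when Clients > 0, Times > 0 ->
    Parent = self(),
    Ids = lists:seq(1, Clients),
    [spawn(?MODULE, run_client, [Parent, Id, Times, Fun]) || Id <- Ids],
    %% Collect exactly one result message per client, in client-id order.
    [receive {done, Id, Rate} -> {Id, Rate} end || Id <- Ids].

%% Body of each client process: time the whole run, then derive the
%% average time per call (seconds) and the call rate (calls/second).
run_client(Parent, Id, Times, Fun) ->
    {Micros, ok} = timer:tc(fun() -> repeat(Times, Fun) end),
    AvgSecs = Micros / Times / 1.0e6,
    Rate = Times * 1.0e6 / max(Micros, 1),
    io:format("Client ~2w done. AvgTime=~.3f (~.3f c/s)~n",
              [Id, AvgSecs, Rate]),
    Parent ! {done, Id, Rate}.

%% Call Fun() N times, ignoring its result.
repeat(0, _Fun) -> ok;
repeat(N, Fun) -> Fun(), repeat(N - 1, Fun).
```

With this sketch, a run comparable to the ones below would be something
like bench_sketch:test_async(2, 1000, fun() ->
drp_client:do_request(Args, 5000) end), where Args and the 5000 ms
timeout are placeholders.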

When I call the server from the same node, the performance is:
- for a single client: 9434 calls/s
- for two clients....: 8771 calls/s (~ 4400 calls/s per client)

(drp@REDACTED)6> drp_client:test_async(1, 1000).
Client  1 done. AvgTime=0.000 (9433.962 c/s)
(drp@REDACTED)7> drp_client:test_async(2, 1000).
Client  2 done. AvgTime=0.000 (4405.286 c/s)
Client  1 done. AvgTime=0.000 (4366.812 c/s)

----

When I call the server from a different host/node, the performance is:
- for a single client: 2538 calls/s
- for two clients....: 30   calls/s (~ 15 calls/s per client)  <--- !!!

(n1@REDACTED)9> drp_client:test_async(1, 1000).
Client  1 done. AvgTime=0.000 (2538.071 c/s)
(n1@REDACTED)10> drp_client:test_async(2, 100).
Client  2 done. AvgTime=0.067 (15.020 c/s)
Client  1 done. AvgTime=0.067 (15.017 c/s)

For some reason, there is a significant performance drop when multiple
clients issue calls concurrently from a remote node.

I profiled the clients running on the drp@REDACTED (local) and
n1@REDACTED (remote) nodes with fprof (output attached), but the
results are very similar.

Now, the interesting part.  When I start the coverage tool on the
server node and cover-compile just a few modules with
cover:compile_module/1, clients on the remote node get the expected
performance:

(drp@REDACTED)2> cover:start().
{ok,<0.115.0>}
(drp@REDACTED)3> lists:foreach(fun(M) -> cover:compile_module(M) 
end, [drp_proto, drp_router, drp_server]).
ok

(n1@REDACTED)24> drp_client:test_async(2, 500).
Client  2 done. AvgTime=0.001 (1483.680 c/s)
Client  1 done. AvgTime=0.001 (1457.726 c/s)

So, the two questions are:

1. What can cause such a significant performance drop when multiple
concurrent clients access the server from a remote node?

2. Why would cover:compile_module/1 eliminate this performance drop?

Your advice would be highly appreciated.

Regards,

Serge

drp_client:
===========
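%% Look up the server and issue a synchronous call.  Exits raised by
%% gen_server:call/3 (e.g. noproc or timeout) are caught and returned
%% as {error, Reason}.
do_request(Args, Timeout) ->
    case get_server_pid() of
        {ok, Pid} ->
            case catch gen_server:call(Pid, Args, Timeout) of
                {ok, From, Response} ->
                    {ok, From, Response};
                {error, Reason} ->
                    {error, Reason};
                {'EXIT', {noproc, _Reason}} ->
                    {error, server_not_running};
                {'EXIT', Reason} ->
                    {error, Reason}
            end;
        {error, Reason} ->
            {error, Reason}
    end.

%% Pick a member of the pg2 group named after this module, preferring
%% a process on the local node.
get_server_pid() ->
    case pg2:get_closest_pid(?MODULE) of
        Pid when is_pid(Pid) ->
            {ok, Pid};
        {error, {no_process, _}} ->
            {error, no_process}
    end.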

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: fprof_remote.txt
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20050421/843a19ed/attachment.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: fprof_local.txt
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20050421/843a19ed/attachment-0001.txt>