[erlang-questions] Message-sending performance

David King <>
Mon Sep 10 01:26:00 CEST 2007


> The problem here is that you are insisting on receiving messages by  
> a given order: {msg, X} with X <- [1..N].  Messages sent become  
> interleaved in the mailbox; when you try to remove, say {msg,  
> 40000} from the 1st process, there are already 10000s of messages  
> in the mailbox from the other process before that one, that must be  
> scanned; receive becomes very slow.

Ah, I can see that, and I can see why, as now it is the messages out  
of order, so expecting to be able to receive them in order is silly

> This is one of the problems one must be keeping an eye on: beware  
> the size of the mailbox. In your problem you will see that if you  
> replace receive by
>                       receive {msg, _} -> ok end
> the program will execute extremely fast, as with 1 process.

Yes, it does, now sending with two processes is just as fast as  
sending with one (and when I actually have work to do as I receive  
them, it will be faster). And in my use-case, receiving the messages  
out-of-order isn't a problem at all.

Thank you for the advice.


> David King wrote:
>> I've noticed that I can send 100,000 messages from one process to   
>> another very quickly (less than a second), but if I have two  
>> process  send 50,000 messages each to a given process, it receives  
>> them very  slowly (in my test, 36s). I found this using the  
>> mapreduce  implementation in Joe's book where potentially  
>> thousands of processes  are spawned and many (potentially very  
>> small) messages are sent.
>> I assume that this is because the message queue is locked while   
>> sending and receiving messages. Is there any way to work around  
>> this  so that having multiple processes sending many messages each  
>> isn't so  slow? In my case I'd like the sender and receiver to be  
>> working  simultaneously, so having each sender process send one  
>> large list  isn't quite as efficient
>> Here it is with one process:
>> ()11> message_streamer:stream_messages(100000).
>> Receiving 10000 at 63356401790
>> Receiving 20000 at 63356401790
>> Receiving 30000 at 63356401790
>> Receiving 40000 at 63356401790
>> Receiving 50000 at 63356401790
>> Receiving 60000 at 63356401790
>> Receiving 70000 at 63356401790
>> Receiving 80000 at 63356401790
>> Receiving 90000 at 63356401790
>> Done sending 100000 messages
>> Receiving 100000 at 63356401790
>> 100000 messages in 1s, 1.00000e+5 msg/s
>> And here it is with two processes:
>> ()9> message_streamer:stream_messages(100000).
>> Receiving 10000 at 63356401639
>> Receiving 20000 at 63356401643
>> Receiving 30000 at 63356401650
>> Receiving 40000 at 63356401660
>> Receiving 50000 at 63356401673
>> Done sending 50000 messages
>> Receiving 60000 at 63356401673
>> Receiving 70000 at 63356401673
>> Receiving 80000 at 63356401673
>> Receiving 90000 at 63356401673
>> Receiving 100000 at 63356401673
>> Done sending 50000 messages
>> 100000 messages in 36s, 2777.78 msg/s
>> In the multiple-process case, it actually slows down as more  
>> messages  are sent until one of the processes completes, and then  
>> it receives  them all very quickly. Here's the code:
>> --- code begins ---
>> -module(message_streamer).
>> -compile(export_all).
>> stream_messages(N) ->
>>    Self=self(),
>>    Start=myapp_util:now(),
>>    List=lists:seq(1,N),
>>    Split_List=split_list(List,erlang:system_info(schedulers)),
>>    lists:foreach(fun(Sublist) ->
>>                      spawn_link(fun() ->
>>                          lists:foreach(fun(Which) ->
>>                                            Self ! {msg,Which}
>>                                       end,
>>                                     Sublist),
>>                          io:format("Done sending ~p messages~n",  
>> [length(Sublist)])
>>                        end)
>>                    end, Split_List),
>>    lists:foreach(fun(Which) ->
>>                      case Which rem (N div 10) of
>>                          0 ->
>>                            io:format("Receiving ~p at ~p ~n",  
>> [Which,myapp_util:now()]);
>>                          _ -> ok
>>                        end,
>>                      receive {msg,Which} -> ok end
>>                    end,
>>                  List),
>>    Time=case myapp_util:now()-Start of
>>             0 -> 1;
>>             X -> X
>>           end,
>>    io:format("~p messages in ~ps, ~p msg/s~n",[N,Time,N/Time]).
>> %% splits a list into equal parts
>> %% split_list([1,2,3,4,5,6],2) -> [[1,2,3],[4,5,6]]
>> split_list(List,Pieces) ->
>>    Length=length(List),
>>    lists:reverse(split_list(List,Length div Pieces,Length,[])).
>> split_list([], _Per_Piece, 0, Acc) ->
>>    Acc;
>> split_list(List, Per_Piece, Length, Acc) when Length>=Per_Piece ->
>>    {Short,Long} = lists:split(Per_Piece,List),
>>    split_list(Long, Per_Piece, Length-Per_Piece, [ Short | Acc ]);
>> split_list(List, Per_Piece, Length, Acc) when Length <Per_Piece ->
>>    {Short,Long} = lists:split(Length, List),
>>    split_list(Long, Per_Piece, Length-Length, [ Short | Acc ]).
>> --- code ends ---
>> myapp_util:now() just looks like:
>> now() ->
>>    calendar:datetime_to_gregorian_seconds(calendar:universal_time()).
>> _______________________________________________
>> erlang-questions mailing list
>> 
>> http://www.erlang.org/mailman/listinfo/erlang-questions
>
>
> -- 
> Paulo Sérgio Almeida                             Email:  
> 
> Dep. Informatica - Universidade do Minho         Phone: +351 253  
> 604451
> Campus de Gualtar - 4710-057 Braga - PORTUGAL    Fax  : +351 253  
> 604471




More information about the erlang-questions mailing list