How to get configuration data to a large number of threads?

Joe Armstrong <>
Wed Oct 27 13:54:56 CEST 2004


On Wed, 27 Oct 2004, Ulf Wiger (AL/EAB) wrote:

>
> An example of Joe's suggestion of moving the functions to
> the data can be found in the gen_leader contribution
> (at sourceforge). An example program using gen_leader is
> gdict -- a fully replicated version of dict. It does
> exactly what Joe proposes: broadcasts a fun which performs
> the same operation on all copies of the replicated dictionary.
>
> /Uffe

There are two alternatives

 	1) send the config data to everybody who needs it
 	2) send functions to a program that has a copy of the
 	   config data. Evaluate these functions locally and
 	   return the values.

If alternative 2 overloads the config server, do as follows:
make one registered process that dispatches work to N worker
processes.

Like this (not tested)

        config() ->
            C = initial_data(),
            Pids = launch(10, C),
            loop(Pids, Pids).

        %% launch(N, C) spawns N workers and returns the list of their pids
        launch(0, _C) -> [];
        launch(N, C) ->
            Pid = spawn(fun() -> worker(C) end),
            [Pid | launch(N-1, C)].

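The master handles reconfigure and compute requests. The first
argument of loop is the list of idle workers; while it is empty,
compute requests stay in the mailbox until a worker reports done.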

        loop([], Pids) ->
            receive
                {done, Pid} ->
                    loop([Pid], Pids);
                {reconfig, C1} ->
                    lists:foreach(fun(Pid) -> Pid ! {reconfig, C1} end, Pids),
                    loop([], Pids)
            end;
        loop(X=[H|T], Pids) ->
            receive
                {done, Pid} ->
                    loop([Pid|X], Pids);
                {reconfig, C1} ->
                    lists:foreach(fun(Pid) -> Pid ! {reconfig, C1} end, Pids),
                    loop(X, Pids);
                {compute, From, F} ->
                    %% hand the request to the first idle worker
                    H ! {compute, self(), From, F},
                    loop(T, Pids)
            end.

The workers do the work

        worker(Config) ->
            receive
                {reconfig, C1} ->
                    worker(C1);
                {compute, Master, From, F} ->
                    From ! {Master, (catch F(Config))},
                    Master ! {done, self()},
                    worker(Config)
            end.
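
Putting the pieces together with a client-side call might look
roughly like the sketch below. This is an illustration under
assumptions, not a tested design: the module name config_pool, the
registered name, the pool size of 4, the call/1 and reconfig/1
helpers, the 5 second timeout and the use of a proplist as config
data are all made up for the example.

```erlang
-module(config_pool).
-export([start/1, call/1, reconfig/1]).

%% Start the master and register it under the (assumed) name config_pool.
start(Config) ->
    Master = spawn(fun() ->
                       Pids = launch(4, Config),
                       loop(Pids, Pids)
                   end),
    register(config_pool, Master),
    Master.

%% Spawn N workers, each holding its own copy of the config data.
launch(0, _C) -> [];
launch(N, C) -> [spawn(fun() -> worker(C) end) | launch(N - 1, C)].

%% First argument: idle workers. Second argument: all workers.
loop([], Pids) ->
    receive
        {done, Pid}    -> loop([Pid], Pids);
        {reconfig, C1} -> broadcast(C1, Pids), loop([], Pids)
    end;
loop(Idle = [H | T], Pids) ->
    receive
        {done, Pid}    -> loop([Pid | Idle], Pids);
        {reconfig, C1} -> broadcast(C1, Pids), loop(Idle, Pids);
        {compute, From, F} ->
            H ! {compute, self(), From, F},
            loop(T, Pids)
    end.

broadcast(C1, Pids) ->
    lists:foreach(fun(Pid) -> Pid ! {reconfig, C1} end, Pids).

%% A worker evaluates F against its local copy and replies to the client.
worker(Config) ->
    receive
        {reconfig, C1} ->
            worker(C1);
        {compute, Master, From, F} ->
            From ! {Master, (catch F(Config))},
            Master ! {done, self()},
            worker(Config)
    end.

%% Client side: send a fun to the pool and wait for the tagged reply.
call(F) ->
    Master = whereis(config_pool),
    Master ! {compute, self(), F},
    receive
        {Master, Val} -> {ok, Val}
    after 5000 ->
        {error, timeout}
    end.

%% Replace the config data in every worker.
reconfig(C1) ->
    config_pool ! {reconfig, C1},
    ok.
```

A client would then do something like
config_pool:call(fun(C) -> proplists:get_value(username, C) end).
The reply is tagged with the master's pid, so the client only needs
to know the registered process and never sees the workers.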

   I suspect that this is the quickest method. I guess sending
functions to the data is probably quicker than sending data to the
functions - BUT - you have to make sure that the functions do not bind
too much of their local context (i.e. they should use as few local
variables from their environment as possible).
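
   To make the point about bound variables concrete, here is a small
sketch. The module name fun_env and the helper env_words/1 are
invented for the illustration, and erts_debug:flat_size/1 is an
undocumented debugging helper, so treat the numbers as a rough
indication only. Both funs compute the same thing, but the first one
drags a large list along in its environment while the second binds
only a precomputed integer.

```erlang
-module(fun_env).
-export([demo/0]).

demo() ->
    Big = lists:seq(1, 100000),
    %% Heavy refers to Big, so the whole list travels with the fun.
    Heavy = fun(Config) ->
                {length(Big), proplists:get_value(username, Config)}
            end,
    %% Light binds only the integer Len computed beforehand.
    Len = length(Big),
    Light = fun(Config) ->
                {Len, proplists:get_value(username, Config)}
            end,
    {env_words(Heavy), env_words(Light)}.

%% Size in words of the variables a fun has captured.
env_words(F) ->
    {env, Env} = erlang:fun_info(F, env),
    erts_debug:flat_size(Env).
```

Calling fun_env:demo() returns a pair of sizes in which the first
element is far larger than the second; that difference is what gets
copied every time the fun is sent in a message.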

   If you think about it, you  have to do the computation anyway (so it
doesn't matter where you do it) so all you have to try and do
is minimise the total amount of message passing.

   Cheers

   /Joe


>
>> -----Original Message-----
>> From: 
>> [mailto:]On Behalf Of Joe Armstrong
>> Sent: den 27 oktober 2004 11:33
>> To: Heinrich Venter
>> Cc: 
>> Subject: Re: How to get configuration data to a large number
>> of threads?
>>
>>
>>
>>    How about some lateral thinking here:
>>
>>> I have a transaction based system that spawns a
>>> thread to handle every incoming transaction.  Unfortunately there is
>>> quite a large chunk of relatively static configuration
>> information that
>>> is needed by every thread....
>>> The question is, how do I get this information to every
>> thread without
>>> significantly slowing things down or using up all the
>> available memory?
>>
>>
>>    You don't - you move the functions to the data - not the
>> data to the
>> functions :-)
>>
>>    Assume the configuration data is HUGE (GBytes) - in this case I'd
>> send the computations to the configuration data. Exactly the opposite
>> of what you did :-)
>>
>>    So you make a global server like this
>>
>>    -module(config).
>>
>>    start() ->
>>  	AHugeDataStructure = read_initial_config_data(),
>>  	loop(AHugeDataStructure).
>>
>>
>>    loop(AHugeDataStructure) ->
>>  	receive
>>  	   {upgrade, AnotherHugeDataStructure} ->
>>  		loop(AnotherHugeDataStructure);
>>  	   {compute, From, Fun} ->
>>  		Val = (catch Fun(AHugeDataStructure)),
>>  	        From ! {self(), Val},
>>  	        loop(AHugeDataStructure)
>>  	end.
>>
>> ... now about the clients ...
>>
>>     Suppose the client wants a username and password
>>
>>     Make a query
>>
>>  	Query = fun(Config) ->
>>  		   {get_from_config(username, Config),
>>  		    get_from_config(password, Config)}
>>  		end,
>>
>>    Send it to the config process with a
>>
>>  	Config ! {compute, self(), Query}
>>
>>  	and wait for the reply
>>
>>
>>    Note that sending Funs in messages (locally) is a very lightweight
>> operation (as I have explained before :-)
>>
>>    Now you possibly have a bottleneck in the server - how to
>> solve this?
>>
>>    Make several servers with identical copies of the
>> configuration data.
>>
>>    Remember - it might be cheaper to move the functions to the data
>> than moving the data to the functions.
>>
>>    RPC style programming in "Other Languages" (TM) makes you believe
>> that there is only one way of doing RPCs (i.e. moving the data
>> to the functions)
>>
>>    Erlang offers freedom of choice.
>>
>>    Cheers
>>
>> /Joe
>>
>>
>>
>>
>>
>>
>


