How to get configuration data to a large number of threads?
Joe Armstrong
joe@REDACTED
Wed Oct 27 13:54:56 CEST 2004
On Wed, 27 Oct 2004, Ulf Wiger (AL/EAB) wrote:
>
> An example of Joe's suggestion of moving the functions to
> the data can be found in the gen_leader contribution
> (at sourceforge). An example program using gen_leader is
> gdict -- a fully replicated version of dict. It does
> exactly what Joe proposes: broadcasts a fun which performs
> the same operation on all copies of the replicated dictionary.
>
> /Uffe
There are two alternatives:
  1) send the config data to everybody who needs it
  2) send functions to a program that has a copy of the
     config data. Evaluate these functions locally and
     return the values.
If 2 overloads the config server, then do as follows:
Make one registered process that dispatches work to N worker
processes, like this (not tested):
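config() ->
    C = initial_data(),
    Pids = launch(10, C),
    %% All workers start idle; keep the full list for reconfig broadcasts.
    loop(Pids, Pids).

%% Spawn N workers and return their pids.
launch(0, _C) -> [];
launch(N, C) ->
    Pid = spawn(fun() -> worker(C) end),
    [Pid | launch(N-1, C)].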
The master handles reconfigure and compute requests
%% loop(Idle, All): Idle are workers ready for work, All is every worker.
%% With no idle workers, compute requests simply wait in the mailbox.
loop([], Pids) ->
    receive
        {done, Pid} ->
            loop([Pid], Pids);
        {reconfig, C1} ->
            lists:foreach(fun(Pid) -> Pid ! {reconfig, C1} end, Pids),
            loop([], Pids)
    end;
loop(X=[H|T], Pids) ->
    receive
        {done, Pid} ->
            loop([Pid|X], Pids);
        {reconfig, C1} ->
            lists:foreach(fun(Pid) -> Pid ! {reconfig, C1} end, Pids),
            loop(X, Pids);
        {compute, From, F} ->
            H ! {compute, self(), From, F},
            loop(T, Pids)
    end.
The workers do the work
worker(Config) ->
    receive
        {reconfig, C1} ->
            worker(C1);
        {compute, Master, From, F} ->
            From ! {Master, (catch F(Config))},
            Master ! {done, self()},
            worker(Config)
    end.
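A client for this pool is not shown above. Here is a minimal sketch,
assuming the master accepts {compute, From, F} and that the worker
replies with {Master, Result} as in worker/1. The module name
config_client, the functions call/2 and call/3, and the 5 second
default timeout are made up for illustration. Master must be the
master's pid (e.g. from whereis/1), not its registered name, because
the reply is tagged with the pid.

```erlang
-module(config_client).
-export([call/2, call/3]).

%% Ask the master to evaluate F against the config data.
%% Hypothetical helper, not part of the original post.
call(Master, F) ->
    call(Master, F, 5000).

call(Master, F, Timeout) when is_pid(Master), is_function(F, 1) ->
    Master ! {compute, self(), F},
    receive
        %% The worker wraps the call in catch, so a crash in F
        %% comes back as {'EXIT', Reason}.
        {Master, {'EXIT', Reason}} -> {error, Reason};
        {Master, Value}            -> {ok, Value}
    after Timeout ->
        {error, timeout}
    end.
```

Usage would be something like
config_client:call(whereis(config), fun(C) -> get_from_config(username, C) end).
Note that a reply arriving after the timeout stays in the caller's
mailbox; a real client would probably tag requests with a unique
reference to deal with that.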
I suspect that this is the quickest method. I guess sending
functions to the data is probably quicker than sending data to the
functions - BUT - you have to make sure that the functions do not bind
too much of the local context (i.e. they should use as few local
variables from their environment as possible), since everything a fun
binds gets copied along with it when it is sent.
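To illustrate the point about bound variables, here is a rough sketch.
The module closure_size and the Request map are hypothetical. It uses
the size of the external term format as a proxy for how much has to be
copied when the fun is sent; the exact numbers will vary between
releases.

```erlang
-module(closure_size).
-export([demo/0]).

demo() ->
    %% Hypothetical request state; the query only needs the user field.
    Request = #{user => joe, payload => lists:seq(1, 100000)},

    %% Binds all of Request, so the large payload travels with the fun.
    Fat = fun(Config) -> maps:get(maps:get(user, Request), Config) end,

    %% Pull out the one value first; the fun now binds only an atom.
    User = maps:get(user, Request),
    Thin = fun(Config) -> maps:get(User, Config) end,

    Cfg = #{joe => admin},
    {Fat(Cfg), Thin(Cfg),
     byte_size(term_to_binary(Fat)),
     byte_size(term_to_binary(Thin))}.
```

Both funs compute the same answer, but the first one should come out
far larger than the second, because its environment includes the whole
payload list.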
If you think about it, you have to do the computation anyway (so it
doesn't matter where you do it) so all you have to try and do
is minimise the total amount of message passing.
Cheers
/Joe
>
>> -----Original Message-----
>> From: owner-erlang-questions@REDACTED
>> [mailto:owner-erlang-questions@REDACTED]On Behalf Of Joe Armstrong
>> Sent: den 27 oktober 2004 11:33
>> To: Heinrich Venter
>> Cc: erlang-questions@REDACTED
>> Subject: Re: How to get configuration data to a large number
>> of threads?
>>
>>
>>
>> How about some lateral thinking here:
>>
>>> I have a transaction based system that spawns a
>>> thread to handle every incoming transaction. Unfortunately there is
>>> quite a large chunk of relatively static configuration
>> information that
>>> is needed by every thread....
>>> The question is, how do I get this information to every
>> thread without
>>> significantly slowing things down or using up all the
>> available memory?
>>
>>
>> You don't - you move the functions to the data - not the
>> data to the
>> functions :-)
>>
>> Assume the configuration data is HUGE (GBytes) - in
>> this case I'd
>> send the computations to the configuration data. Exactly the opposite
>> of what you did :-)
>>
>> So you make a global server like this
>>
>> -module(config).
>>
>> start() ->
>> AHugeDataStructure = read_initial_config_data(),
>> loop(AHugeDataStructure).
>>
>>
>> loop(AHugeDataStructure) ->
>> receive
>> {upgrade, AnotherHugeDataStructure} ->
>> loop(AnotherHugeDataStructure);
>> {compute, From, Fun} ->
>> Val = (catch Fun(AHugeDataStructure)),
>> From ! {self(), Val},
>> loop(AHugeDataStructure)
>> end.
>>
>> ... now about the clients ...
>>
>> Suppose the client wants a username and password
>>
>> Make a query
>>
>> Query = fun(Config) ->
>>             {get_from_config(username, Config),
>>              get_from_config(password, Config)}
>>         end,
>>
>> Send it to the config process with a
>>
>> Config ! {compute, self(), Query}
>>
>> and wait for the reply
>>
>>
>> Note that sending Funs in messages (locally) is a very lightweight
>> operation (as I have explained before :-)
>>
>> Now you possibly have a bottleneck in the server - how to
>> solve this?
>>
>> Make several servers with identical copies of the
>> configuration data.
>>
>> Remember - it might be cheaper to move the functions to the data
>> than moving the data to the functions.
>>
>> RPC style programming in "Other Languages" (TM) makes you believe
>> that there is only one way of doing RPCs (ie moving the data
>> to the functions)
>>
>> Erlang offers freedom of choice.
>>
>> Cheers
>>
>> /Joe
>>
>>
>>
>>
>>
>>
>