Distributing things

Sun Nov 8 02:59:00 CET 2009

On Sat, Nov 07, 2009 at 09:02:39PM +0000, Calum wrote:
> On Sat, Nov 7, 2009 at 8:55 PM, Fabio Mazanatti
> <fabio.mazanatti@REDACTED> wrote:
> > Hi,
> >
> > the scenario I imagined for this exercise, based on Calum's text would be
> > something like assigning a range of numbers to each node, starting with the
> > lowest possible given by the entry parameter (the # of digits), controlling
> > its distribution and wrapping things up when a prime number is found by one
> > of the nodes.
> >
> > An example: 4 nodes, generating a prime with 6 digits, would start at
> > 100.000. The entry point would assign a range from 100.000 to 105.000 to
> > node 1, 105.001 to 110.000 to node 2, and so on. When a node finishes
> > traversing a range, a new block is assigned to it, until one of them returns
> > a valid value, and processing stops.
> 
> Yes, this is the sort of thing I would do for this problem.
> However, for other problems, perhaps the splitting method isn't so clearcut.

So for these other problems, you would use a different method of
distribution.  For this problem, chunking the search space and distributing
it to clients is a good method, so use it.

"Right tool for the job" and all that.

> Also, your method only really allows for 20 nodes - what would you do
> if you suddenly had 50 nodes available to join in?

Since the controlling process keeps control of the chunks, if more nodes
join in mid-job, they just ask the controller for the next available chunk
and go to it.
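
A rough sketch of that idea, in Python rather than Erlang just to keep it
short.  Threads stand in for nodes, and every name here (Controller,
next_chunk, report, find_prime) is made up for illustration, as are the
chunk size and the half-open ranges.  The point is only that the
controller owns the "next chunk" counter, so a worker that shows up late
needs nothing special: it asks for a chunk like everyone else.

```python
import threading


def is_prime(n):
    """Trial division; fine for a demo, far too slow for really big primes."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    i = 3
    while i * i <= n:
        if n % i == 0:
            return False
        i += 2
    return True


class Controller:
    """Owns the search space and hands out one chunk per request."""

    def __init__(self, digits, chunk_size=5000):
        self.next_start = 10 ** (digits - 1)  # lowest number with that many digits
        self.limit = 10 ** digits             # first number with too many digits
        self.chunk_size = chunk_size
        self.result = None
        self.lock = threading.Lock()

    def next_chunk(self):
        """Return (start, stop), stop exclusive, or None when there is no work left."""
        with self.lock:
            if self.result is not None or self.next_start >= self.limit:
                return None
            start = self.next_start
            stop = min(start + self.chunk_size, self.limit)
            self.next_start = stop
            return start, stop

    def report(self, prime):
        """Record the first prime reported; later reports are ignored."""
        with self.lock:
            if self.result is None:
                self.result = prime


def worker(controller):
    """Keep asking for chunks until a prime is found or the space runs out."""
    while True:
        chunk = controller.next_chunk()
        if chunk is None:
            return
        for n in range(*chunk):
            if is_prime(n):
                controller.report(n)
                return


def find_prime(digits, initial_workers=4, late_workers=2):
    controller = Controller(digits)
    threads = [threading.Thread(target=worker, args=(controller,))
               for _ in range(initial_workers)]
    for t in threads:
        t.start()
    # Workers that join mid-job use exactly the same entry point.
    late = [threading.Thread(target=worker, args=(controller,))
            for _ in range(late_workers)]
    for t in late:
        t.start()
    for t in threads + late:
        t.join()
    return controller.result
```

Calling find_prime(6) returns some six-digit prime, not necessarily the
smallest one, since whichever worker reports first wins.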

> Actually, in the real world, do Erlang node clusters vary in size a
> lot during operation, or do most people know how many nodes they have
> available at the start, and just program for that?

Honestly, I don't think Erlang is the tool of choice for many big-iron
distributed computation problems.  It can still be a good option
(especially if you have highly-optimised native code doing the number
crunching, for maximum speed); it just isn't as widely used for this as
the better-known methods.

> Is my attempting to cope with varied node numbers something that just
> isn't really needed?

It's your problem, not ours -- if it's necessary to solve your problem, then
it's needed.  Personally, I'd try the simplest thing that could possibly
work first, which would be spawning a fixed number of processes to do the
crunching.  If that was too slow in benchmarking (run a subset of the
problem, time it, extrapolate) and, say, I wanted to start the search now
but had a pile of new hardware coming online next month, then I'd make
sure that I could increase the number of available workers dynamically.
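
The "run a subset, time it, extrapolate" step could look something like
the following, again in Python for brevity.  The function names are
invented for this example, and the estimate assumes the cost per candidate
is roughly uniform and that workers scale perfectly.  Neither is quite
true in practice, so treat the number as a ballpark.

```python
import time


def is_prime(n):
    """Trial division, standing in for whatever the real crunching is."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    i = 3
    while i * i <= n:
        if n % i == 0:
            return False
        i += 2
    return True


def time_subset(start, count):
    """Time the work on a small slice of the problem, in seconds."""
    t0 = time.perf_counter()
    for n in range(start, start + count):
        is_prime(n)
    return time.perf_counter() - t0


def extrapolate(sample_seconds, sample_count, total_count, workers=1):
    """Linear estimate of the full run time from a timed sample."""
    return sample_seconds * (total_count / sample_count) / workers


if __name__ == "__main__":
    sample = 10000
    elapsed = time_subset(100000, sample)
    total = 900000  # how many six-digit numbers there are
    for w in (1, 4, 20):
        print(w, "workers: about", extrapolate(elapsed, sample, total, w), "seconds")
```

If the estimate for the fixed worker count is acceptable, there is no need
to build anything more dynamic.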

- Matt