[erlang-questions] Classification of concurrency patterns
Vlad Dumitrescu
vladdu55@REDACTED
Wed May 12 12:35:09 CEST 2010
Hello Joe,
On Wed, May 12, 2010 at 09:31, Joe Armstrong <erlang@REDACTED> wrote:
> I'm interested in trying to classify and name a number of different
> parallelization strategies.
> 1) Divide and conquer concurrency
> 2) Pipeline concurrency
> 3) Map reduce concurrency
> 4) Identical Job concurrency
> 5) Grid concurrency
> 6) Job queue concurrency
IMHO, #1, #3 and #4 are the same thing, differently parametrized. Do
you think it is important to get them separate identities?
Regarding #5, a more general description would involve a grid that is
not restricted to two dimensions. I think I remember seeing papers
about 3D grids.
#6 can be seen as a generalization of #2, where in the pipeline the
"next" worker is hardcoded and the structure is linear. A pipeline may
or may not use buffering queues between stages.
The way I see it, there are two main categories:
* dividing the data set into smaller chunks and processing them in
parallel (#1,3,4)
* dividing the work into specialized chunks and letting the data flow
to next chunk (#2,6)
These two can also be combined in several ways, giving for example #5
(usually each processor has its own chunk of input data, but data can
flow to neighboring processors). Or each processing unit in #1 could
involve a pipeline.
It depends on the purpose of defining these categories whether a
shorter, more general list or a more detailed one is required.
best regards,
Vlad
> 1) Divide and conquer concurrency
>
> Divide the problem into K jobs which can be performed in parallel.
> Perform the jobs
> Recombine the results
>
> 2) Pipeline concurrency
>
> Divide the problem into K jobs that can be performed in a pipeline.
> The output of the first job must be the input of the second and so on.
> Run all the steps in parallel
>
> 3) Map reduce concurrency
>
> This is similar to 1). Only the recombination step involves
> merging results that
> have the same Key.
>
> 4) Identical Job concurrency
>
> Here the problem does not have to be split into K parts, we
> already have N identical
> jobs to do (identical means jobs that will consume similar resources)
>
> There is no recombination step. There are N results.
>
> 5) Grid concurrency
>
> Divide the problem into N*M identical jobs that can be solved on a
> NxM grid of processors.
> Each processor can only talk to its direct neighbors
>
> 6) Job queue concurrency
>
> Divide the problem into a set of named workers (who do jobs). Each
> worker has an input
> job queue. Workers consume jobs from their job queues and send the
> result to other workers.
>
> Note: [1] Paul Morrison calls this flow based programming
More information about the erlang-questions
mailing list