[erlang-questions] Classification of concurrency patterns

AJ Heller aj@REDACTED
Wed May 12 21:34:17 CEST 2010


I would certainly appreciate a single reference on high-level
distributed/parallel design patterns. I can find some of the patterns
you list in a handful of books and papers, but not all of them in one
place. Over the past year or so, I've searched for a compiled list
like you're suggesting, and I haven't turned up much. So even if such
a thing does exist, I'm probably not the only one who hasn't found it
yet.

While having implementations of these patterns would be great, I'd
just like to add that if a list of patterns must be compiled to begin
with, that list would be of great benefit to a broader audience than
just the Erlang community. I'm thinking in particular of a GoF-style
pattern analysis in which you list the common uses, strengths, and
drawbacks of each pattern. A work like that on high-level
parallel/distributed design patterns would be invaluable.

@Vlad: While I empathize with you, I do think the distinctions are
important. Taking myself as an example, I can still look at most of
the GoF patterns and argue that each is a special case of the
Strategy Pattern (though I've learned to stop arguing about this). To
people who work with these patterns on a daily basis, the nuances
between the patterns are significant. So even though I don't yet grasp
many of the nuances myself, I can appreciate that other people do. As
long as a pattern and its variation differ significantly in their
appropriate uses, strengths, and pitfalls, I'd argue that the
variation probably deserves its own entry in the list. Even if they
all still look like divide and conquer to me :) .

On Wed, May 12, 2010 at 3:35 AM, Vlad Dumitrescu <vladdu55@REDACTED> wrote:
> Hello Joe,
>
> On Wed, May 12, 2010 at 09:31, Joe Armstrong <erlang@REDACTED> wrote:
>> I'm interested in trying to classify and name a number of different
>> parallelization strategies.
>
>> 1) Divide and conquer concurrency
>> 2) Pipeline concurrency
>> 3) Map reduce concurrency
>> 4) Identical Job concurrency
>> 5) Grid concurrency
>> 6) Job queue concurrency
>
> IMHO, #1, #3 and #4 are the same thing, differently parametrized. Do
> you think it is important to give them separate identities?
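
To make the "same thing, differently parametrized" reading concrete, here is a
minimal Erlang sketch (pjobs/4 and its argument names are invented for
illustration, not an existing API): the same spawn-and-collect skeleton gives
#1, #3 or #4 depending on what you plug in for Split and Combine.

    %% Hypothetical generic skeleton: Split, Work and Combine are the
    %% parameters that distinguish #1, #3 and #4.
    %%   #1: Split chunks the input, Combine merges the partial results.
    %%   #3: Combine additionally merges results that share a key.
    %%   #4: Split is the identity (the N jobs already exist) and
    %%       Combine just returns the N results.
    pjobs(Split, Work, Combine, Input) ->
        Parent = self(),
        Pids = [spawn(fun() -> Parent ! {self(), Work(Job)} end)
                || Job <- Split(Input)],
        Combine([receive {Pid, Result} -> Result end || Pid <- Pids]).
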
>
> Regarding #5, a more general description would involve a grid that is
> not restricted to two dimensions. I think I remember seeing papers
> about 3D grids.
>
> #6 can be seen as a generalization of #2: in a pipeline, the "next"
> worker is hardcoded and the structure is linear. A pipeline may or
> may not use buffering queues between stages.
>
> The way I see it, there are two main categories:
> * dividing the data set into smaller chunks and processing them in
> parallel (#1, #3, #4)
> * dividing the work into specialized chunks and letting the data flow
> to the next chunk (#2, #6)
>
> These two can also be combined in several ways, giving for example #5
> (usually each processor has its own chunk of input data, but data can
> flow to neighboring processors). Or each processing unit in #1 could
> involve a pipeline.
>
> Whether a shorter, more general list or a more detailed one is needed
> depends on the purpose of defining these categories.
>
> best regards,
> Vlad
>
>> 1) Divide and conquer concurrency
>>
>> Divide the problem into K jobs that can be performed in parallel.
>> Perform the jobs.
>> Recombine the results.
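
A minimal (toy) sketch of 1): summing a large list by splitting it into K
chunks, summing each chunk in its own process, and recombining by summing the
partial sums. psum/2 and split_into/2 are names made up for the example, not
library functions.

    %% Divide and conquer: split, work on the parts in parallel, recombine.
    psum(List, K) ->
        Parent = self(),
        Pids = [spawn(fun() -> Parent ! {self(), lists:sum(Chunk)} end)
                || Chunk <- split_into(K, List)],
        lists:sum([receive {Pid, Sum} -> Sum end || Pid <- Pids]).

    %% Split a list into at most K chunks of roughly equal size.
    split_into(K, List) ->
        ChunkSize = (length(List) + K - 1) div K,    % ceiling division
        chunk(ChunkSize, List).

    chunk(_, []) -> [];
    chunk(Size, List) when length(List) =< Size -> [List];
    chunk(Size, List) ->
        {Chunk, Rest} = lists:split(Size, List),
        [Chunk | chunk(Size, Rest)].
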
>>
>> 2) Pipeline concurrency
>>
>> Divide the problem into K jobs that can be performed in a pipeline.
>> The output of the first job must be the input of the second, and so on.
>> Run all the steps in parallel.
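
A sketch of 2): one process per stage, each applying its function and
forwarding to the next stage; the final "sink" stage replies to whoever
injected the item. pipeline/1, run/2 and the {item, ...}/{result, ...}
messages are invented for illustration.

    %% Pipeline: one process per stage; each stage applies its function
    %% and forwards the result to the next stage.
    pipeline(StageFuns) ->
        Sink = spawn(fun sink/0),
        lists:foldr(fun(F, Next) -> spawn(fun() -> stage(F, Next) end) end,
                    Sink, StageFuns).

    stage(F, Next) ->
        receive
            {item, From, X} -> Next ! {item, From, F(X)}, stage(F, Next)
        end.

    sink() ->
        receive
            {item, From, X} -> From ! {result, X}, sink()
        end.

    %% Feed items into the first stage and collect the results; because
    %% every item follows the same path, replies come back in item order.
    run(FirstStage, Items) ->
        [FirstStage ! {item, self(), Item} || Item <- Items],
        [receive {result, R} -> R end || _ <- Items].
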
>>
>> 3) Map reduce concurrency
>>
>>    This is similar to 1), except that the recombination step merges
>>    results that share the same key.
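
A sketch of 3) with the same spawn-and-collect shape, but a key-based merge.
mapreduce/2 is an illustrative name; the map function is assumed to return a
list of {Key, Value} pairs.

    %% Map-reduce: run the map function on each input in parallel, then
    %% merge the {Key, Value} pairs so that values sharing a key end up
    %% in one list per key.
    mapreduce(MapFun, Inputs) ->
        Parent = self(),
        Pids = [spawn(fun() -> Parent ! {self(), MapFun(Input)} end)
                || Input <- Inputs],
        Pairs = lists:append([receive {Pid, KVs} -> KVs end || Pid <- Pids]),
        Grouped = lists:foldl(fun({K, V}, D) -> dict:append(K, V, D) end,
                              dict:new(), Pairs),
        dict:to_list(Grouped).

    %% Example: word counting, where a reduce step would then sum each
    %% per-key list:
    %%   mapreduce(fun(Line) ->
    %%                 [{W, 1} || W <- string:tokens(Line, " ")]
    %%             end, Lines).
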
>>
>> 4) Identical Job concurrency
>>
>>    Here the problem does not have to be split into K parts; we
>> already have N identical jobs to do (identical meaning jobs that
>> will consume similar resources).
>>
>>   There is no recombination step. There are N results.
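
For 4) the skeleton degenerates, which is perhaps Vlad's point: no split, no
recombination, just N jobs in and N results out. run_jobs/1 is an illustrative
name; Jobs is assumed to be a list of zero-argument funs.

    %% Identical job concurrency: N independent jobs in, N results out.
    run_jobs(Jobs) ->
        Parent = self(),
        Pids = [spawn(fun() -> Parent ! {self(), Job()} end) || Job <- Jobs],
        [receive {Pid, Result} -> Result end || Pid <- Pids].
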
>>
>> 5) Grid concurrency
>>
>>    Divide the problem into N*M identical jobs that can be solved on an
>> NxM grid of processors.
>>    Each processor can only talk to its direct neighbors.
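
A possible sketch of 5), with a made-up message protocol: build an NxM grid of
worker processes, tell each worker the Pids of its direct neighbors, and from
then on workers only exchange messages with those neighbors. grid/3,
neighbors/2 and the {data, ...} messages are all assumptions for the example.

    %% Grid concurrency: an NxM grid of workers, each restricted to
    %% talking to its direct (up/down/left/right) neighbors.
    grid(N, M, WorkFun) ->
        Grid = [{{Row, Col}, spawn(fun() -> worker(WorkFun) end)}
                || Row <- lists:seq(1, N), Col <- lists:seq(1, M)],
        [Pid ! {neighbors, neighbors(Pos, Grid)} || {Pos, Pid} <- Grid],
        Grid.

    %% Direct neighbors are the cells at Manhattan distance 1.
    neighbors({Row, Col}, Grid) ->
        [Pid || {{R, C}, Pid} <- Grid, abs(R - Row) + abs(C - Col) =:= 1].

    worker(WorkFun) ->
        receive {neighbors, Neighbors} -> worker(WorkFun, Neighbors) end.

    worker(WorkFun, Neighbors) ->
        receive
            {data, X} ->
                %% WorkFun decides what, if anything, to pass on; here it
                %% is assumed to return a list of {NeighborPid, Message}.
                [Pid ! {data, Msg} || {Pid, Msg} <- WorkFun(X, Neighbors)],
                worker(WorkFun, Neighbors)
        end.
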
>>
>> 6) Job queue concurrency
>>
>>    Divide the problem into a set of named workers (which do the jobs).
>> Each worker has an input job queue. Workers consume jobs from their
>> job queues and send the results to other workers.
>>
>>    Note: [1] Paul Morrison calls this flow-based programming.
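
A sketch of 6) in Erlang, where the process mailbox already serves as the
input job queue: a named worker is a registered process that consumes jobs
from its mailbox and posts results on to other named workers. The registered
names, the {job, ...} message format and the placeholder functions in the
usage comment are all invented for the example.

    %% Job queue concurrency: named workers whose mailboxes are the queues.
    start_worker(Name, WorkFun) ->
        register(Name, spawn(fun() -> worker_loop(WorkFun) end)).

    worker_loop(WorkFun) ->
        receive
            {job, Job} ->
                %% WorkFun returns {NextWorker, NextJob} to pass work on,
                %% or 'done' when nothing further needs to happen.
                case WorkFun(Job) of
                    {NextWorker, NextJob} -> NextWorker ! {job, NextJob};
                    done -> ok
                end,
                worker_loop(WorkFun)
        end.

    %% Usage (parse/1, transform/1 and save/1 are placeholders):
    %%   start_worker(parser, fun(Raw) -> {transformer, parse(Raw)} end),
    %%   start_worker(transformer, fun(X) -> {writer, transform(X)} end),
    %%   start_worker(writer, fun(X) -> save(X), done end),
    %%   parser ! {job, RawData}.
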
>

