[erlang-questions] lists:pmap ?

Joe Armstrong erlang@REDACTED
Tue Aug 22 18:34:44 CEST 2017


Interesting. I teach a class in concurrent and parallel programming
where I show a simple pmap that works on the condition that all processes
terminate within a reasonable time and with no errors.
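
A minimal sketch of such a simple pmap might look like the following. It is
not the version from the class; the module name pmap_simple and the helper
spawn_worker/3 are illustrative assumptions, and there is deliberately no
timeout or error handling, so it relies on exactly the condition above.

```erlang
-module(pmap_simple).
-export([pmap/2]).

%% Spawn one process per element, then collect the replies in list order.
%% Assumes every F(X) terminates and never crashes: a worker that dies
%% or loops forever would leave the caller blocked in receive.
pmap(F, L) ->
    Parent = self(),
    Refs = [spawn_worker(Parent, F, X) || X <- L],
    %% Ref is already bound here, so each receive selects only the
    %% reply that belongs to that position in the list.
    [receive {Ref, Result} -> Result end || Ref <- Refs].

%% Hypothetical helper: start a worker and return the unique
%% reference it will tag its reply with.
spawn_worker(Parent, F, X) ->
    Ref = make_ref(),
    spawn(fun() -> Parent ! {Ref, F(X)} end),
    Ref.
```

For example, pmap_simple:pmap(fun(X) -> 2*X end, [1,2,3]) should give the
same result as lists:map/2 with the same arguments.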

The more fun cases do things like limiting the number of parallel processes,
or not bothering about the order of the replies in the result list. We might
want to compute F(X) for all elements X in a list L, but given
L = [X1,X2,X3] we might not need exactly [F(X1),F(X2),F(X3)]; any ordering
might be acceptable (say [F(X2),F(X1),F(X3)]).
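
A sketch of the order-agnostic variant, under the same assumption that all
workers terminate without errors, could be written like this. The module
name pmap_unordered and the helper collect/3 are made up for illustration.
Results are gathered in whatever order the replies arrive.

```erlang
-module(pmap_unordered).
-export([pmap/2]).

%% Compute F(X) for every X in L in parallel and return the results
%% in arrival order, which need not match the order of L.
pmap(F, L) ->
    Parent = self(),
    Tag = make_ref(),   % one shared tag: we only count replies
    lists:foreach(
      fun(X) -> spawn(fun() -> Parent ! {Tag, F(X)} end) end,
      L),
    collect(Tag, length(L), []).

%% Hypothetical helper: receive exactly N tagged replies.
collect(_Tag, 0, Acc) ->
    Acc;
collect(Tag, N, Acc) ->
    receive
        {Tag, Result} -> collect(Tag, N - 1, [Result | Acc])
    end.
```

Since no per-element reference is needed, the caller never waits on a slow
early element while later results are already sitting in the mailbox.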

If we have N cores we might want to limit the number of parallel processes
to, say, 2N, since we can't actually do more than N things in parallel -
or we might not.
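
One way to sketch the bounded version is shown below. The module name
pmap_bounded, the helper run/6, and the choice of
erlang:system_info(schedulers_online) as a stand-in for "N cores" are all
assumptions made for this example; it again assumes workers terminate
without errors.

```erlang
-module(pmap_bounded).
-export([pmap/2, pmap/3]).

%% Default limit of 2N workers, taking N to be the number of
%% schedulers online (an assumption standing in for "cores").
pmap(F, L) ->
    pmap(F, L, 2 * erlang:system_info(schedulers_online)).

%% At most Max workers run at any time; results keep the order of L.
pmap(F, L, Max) when is_integer(Max), Max > 0 ->
    Tag = make_ref(),
    Indexed = lists:zip(lists:seq(1, length(L)), L),
    Results = run(F, Tag, Indexed, 0, Max, []),
    [R || {_I, R} <- lists:keysort(1, Results)].

%% run(F, Tag, Pending, Running, Max, Acc)
%% Nothing pending and nothing running: done.
run(_F, _Tag, [], 0, _Max, Acc) ->
    Acc;
%% Room for another worker: start one on the next pending element.
run(F, Tag, [{I, X} | Rest], Running, Max, Acc) when Running < Max ->
    Parent = self(),
    spawn(fun() -> Parent ! {Tag, I, F(X)} end),
    run(F, Tag, Rest, Running + 1, Max, Acc);
%% Otherwise wait for any worker to finish, which frees a slot.
run(F, Tag, Pending, Running, Max, Acc) ->
    receive
        {Tag, I, R} ->
            run(F, Tag, Pending, Running - 1, Max, [{I, R} | Acc])
    end.
```

Calling pmap_bounded:pmap(F, L, 1) degenerates to a sequential map, which
is a handy sanity check when experimenting with different limits.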

You might want to look at this project

http://skel.weebly.com/about-skel.html

which has a number of algorithmic skeletons for parallelizing Erlang programs.

Cheers

/Joe


On Tue, Aug 22, 2017 at 10:58 AM, zxq9 <zxq9@REDACTED> wrote:
> On Tuesday, 22 August 2017 08:42:01 Ola Andersson A wrote:
>> Reading the discussion about lists:mapfind reminded me of a function I rediscovered recently.
>> It's rpc:pmap/3 that was originally defined for use in a distributed environment, spreading out the processing over several nodes. It actually works on a single multicore node as well even though it wasn't designed for that purpose.
>> With the limited tests I have done it seems to significantly outperform lists:map/2 and also scale reasonably well. The almost negligible cost of spawning erlang processes is still amazing.
>> How about adding a lists:pmap/2 function, designed for multicore, in the lists module?
>
>
> I've implemented something along these lines several times in an ad hoc manner for pure map functions in various projects, especially client-side code (but the function being mapped really needs to be pure!). I imagine that's fairly common (in client code not doing this is sometimes crazy).
>
> It WOULD be pretty awesome to have this in the standard library, and I can easily imagine a version of that which would be written against other collection-type data structures as well...
>
> But I also imagine that the arbitrariness of the input would suddenly make an implementation become non-trivial to go about in a safe way (or not at least without splashing the documentation liberally with warnings that pmap might blow your node up, that mapped functions really need to be pure, etc.). Do we limit worker spawns to a finite total size based on the VM's condition and keep up with it as return values are reported back? "The VM's condition" is a special ball of madness right there. Etc.
>
> This could be picked through, and probably quite effectively if some time is put to it. I just want to point out that the implementation will either be super naive (but still useful in cases where people know what they are dealing with), or super involved for what is conceptually a very simple idea -- but probably not very much happy middle exists between those two extremes.
>
> -Craig