[erlang-questions] lists:pmap ?

Tue Aug 22 10:58:47 CEST 2017

On Tuesday, 22 August 2017 08:42:01 Ola Andersson A wrote:
> Reading the discussion about list:mapfind reminded me of a function I rediscovered recently.
> It's rpc:pmap/3 that was originally defined for use in a distributed environment, spreading out the processing over several nodes. It actually works on a single multicore node as well even though it wasn't designed for that purpose.
> With the limited tests I have done it seems to significantly outperform lists:map/2 and also scale reasonably well. The almost negligible cost of spawning erlang processes is still amazing.
> How about adding a lists:pmap/2 function, designed for multicore, in the lists module?
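
For reference, the rpc:pmap/3 call mentioned above takes a {Module, Function} pair, a list of extra arguments, and the list to map over. A minimal sketch of using it on a single node follows; the module name rpc_pmap_sketch and the square/1 helper are made up for illustration, and the mapped function has to be exported so rpc can apply it.

```erlang
-module(rpc_pmap_sketch).
-export([squares/1, square/1]).

%% Hypothetical pure function to map; exported so rpc can apply it.
square(X) -> X * X.

%% rpc:pmap/3 evaluates apply(Module, Function, [Elem | ExtraArgs])
%% for each element, spreading the calls over [node() | nodes()].
%% On a single node all of the calls simply run locally.
squares(List) ->
    rpc:pmap({?MODULE, square}, [], List).
```

Calling rpc_pmap_sketch:squares([1, 2, 3]) should give back the results in input order.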

I've implemented something along these lines several times, ad hoc, for pure map functions in various projects, especially in client-side code (but the function being mapped really needs to be pure!). I imagine that's fairly common; in client code, not doing this is sometimes crazy.
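
The kind of ad hoc version I mean looks roughly like the sketch below. It is only an illustration, not a proposal for the stdlib: the module name pmap_sketch is invented, it assumes the mapped fun is pure, and it spawns one process per element with no limit at all.

```erlang
-module(pmap_sketch).
-export([pmap/2]).

%% Naive parallel map: one linked process per element.
%% Assumptions: F is pure, and the list is small enough that
%% spawning a process per element is acceptable.
%% Because workers are linked, a crash in F takes the caller down
%% too (unless the caller traps exits).
pmap(F, List) ->
    Parent = self(),
    Refs = [begin
                Ref = make_ref(),
                spawn_link(fun() -> Parent ! {Ref, F(X)} end),
                Ref
            end || X <- List],
    %% Selective receive on each unique Ref, in spawn order,
    %% so results come back in input order.
    [receive {Ref, Result} -> Result end || Ref <- Refs].
```

For example, pmap_sketch:pmap(fun(X) -> X * X end, [1, 2, 3]) returns [1, 4, 9].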

It WOULD be pretty awesome to have this in the standard library, and I can easily imagine versions of it written against other collection-type data structures as well...

But I also imagine that the arbitrariness of the input would suddenly make a safe implementation non-trivial (or at least not without splashing the documentation liberally with warnings that pmap might blow your node up, that mapped functions really need to be pure, etc.). Do we limit worker spawns to a finite total based on the VM's condition, and track that total as return values are reported back? "The VM's condition" is a special ball of madness right there. Etc.
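
To make the "limit worker spawns" question concrete, here is a rough sketch that keeps at most Max workers in flight and spawns the next one each time a result comes back. The module name bounded_pmap_sketch and the Max argument are assumptions for illustration; it takes a fixed limit from the caller and deliberately does not try to work out "the VM's condition", which is the hard part.

```erlang
-module(bounded_pmap_sketch).
-export([pmap/3]).

%% Parallel map with at most Max workers running at once.
%% Assumptions: F is pure; Max is a fixed, caller-supplied limit.
pmap(F, List, Max) when is_integer(Max), Max > 0 ->
    Tag = make_ref(),
    %% Pair each element with its position so order can be restored.
    Indexed = lists:zip(lists:seq(1, length(List)), List),
    loop(F, self(), Tag, Indexed, 0, Max, []).

%% Nothing left to spawn and nothing in flight: restore input order.
loop(_F, _Parent, _Tag, [], 0, _Max, Acc) ->
    [R || {_I, R} <- lists:keysort(1, Acc)];
%% Below the limit and input remains: spawn one more worker.
loop(F, Parent, Tag, [{I, X} | Rest], Running, Max, Acc)
  when Running < Max ->
    spawn_link(fun() -> Parent ! {Tag, I, F(X)} end),
    loop(F, Parent, Tag, Rest, Running + 1, Max, Acc);
%% At the limit, or input exhausted: wait for one result.
loop(F, Parent, Tag, Pending, Running, Max, Acc) ->
    receive
        {Tag, I, R} ->
            loop(F, Parent, Tag, Pending, Running - 1, Max,
                 [{I, R} | Acc])
    end.
```

Something like bounded_pmap_sketch:pmap(F, List, erlang:system_info(schedulers_online)) would be one plausible way to pick the limit, but that is a guess at a reasonable default rather than a tuned answer.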

This could be picked through, and probably quite effectively if some time is put into it. I just want to point out that the implementation will either be super naive (but still useful in cases where people know what they are dealing with), or super involved for what is conceptually a very simple idea -- there is probably not much of a happy middle between those two extremes.

-Craig