Robert,<div><br></div><div>In my ppool implementation I re-create processes; I don't recycle them. I call it a "pool" to contrast it with the strategy of breaking off "hunks" (sublists) and spawning x processes for the items in each sublist. With that strategy, if one process takes longer than the others, the remaining resources sit idle until it is done, and only then does the subdivision start again.</div>
<div><br></div><div>The "pool" implementation lets me delegate in a round robin fashion. Nothing is recycled: I create a new process whenever another one finishes, so that at most x processes are working at any time until the job is done.</div>
<div><br></div><div>pmap in my use case would be "bad". Very bad. The list has 3000 items, about 1400 of which make more than 20 HTTP requests each (the rest make about 3 to 4). Running all of them at once would completely tank the machine and would also be irresponsible crawling. But I *do* want to do the work in parallel, just not at that scale. So, using a process pool strategy, I limit the number of concurrent crawl workers to about 5 or 6, which the machine handles well.</div>
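<div><br></div><div>For illustration only, here is a minimal sketch of that bounded-pool idea. It is written in Python rather than Erlang and is not the ppool code; the name pool_map, its max_workers parameter, and the use of one short-lived thread per item are all assumptions made for the example. It starts a fresh worker for each item, never reuses one, and only starts the next worker when a running one has reported back.</div>
<pre>
```python
import queue
import threading


def pool_map(fun, items, max_workers):
    """Map fun over items with at most max_workers workers alive at once.

    Workers are never reused: each item gets a fresh thread, and a new
    thread is started only when a running one reports that it finished.
    Results are returned in input order.
    """
    if 1 > max_workers:
        raise ValueError("max_workers must be at least 1")
    items = list(items)
    results = [None] * len(items)
    done = queue.Queue()  # workers report (index, value, error) here

    def worker(index, item):
        try:
            done.put((index, fun(item), None))
        except Exception as exc:  # report failures instead of hanging
            done.put((index, None, exc))

    next_index = 0  # next item that still needs a worker
    running = 0     # workers started but not yet reported
    while len(items) > next_index or running > 0:
        # Top the pool up to max_workers.
        while max_workers > running and len(items) > next_index:
            args = (next_index, items[next_index])
            threading.Thread(target=worker, args=args).start()
            next_index += 1
            running += 1
        # Block until any one worker finishes, freeing a slot.
        index, value, error = done.get()
        running -= 1
        if error is not None:
            raise error
        results[index] = value
    return results
```
</pre>
<div>With a hypothetical crawl_item function, pool_map(crawl_item, items, 5) would keep at most five crawls in flight, and one slow item would only ever occupy a single slot while the other four keep moving through the list.</div>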
<div><br><div class="gmail_quote">On Fri, Jul 22, 2011 at 5:10 PM, Robert Virding <span dir="ltr"><<a href="mailto:robert.virding@erlang-solutions.com">robert.virding@erlang-solutions.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
<div><div style="font-family:Times New Roman;font-size:12pt;color:#000000">But a pmap SHOULD start processes for all the elements in the list in parallel. It is after all a 'P'map. In which case all the processes will be running and processing in parallel as you want. The only reason I can see for using a worker pool is if you actually want to LIMIT the number of processes running at the same time.<br>
<br>IMAO in Erlang there are only two reasons for using worker/process pools:<br><br>- you want/need to limit the number of "things" running in parallel<br>- you actually do want to reuse a process for another computation, there is something in the application which mandates reusing processes.<br>
<br>Otherwise it is just extra work, process creation/termination is so fast that there is no real gain in keeping them around to reuse.<div><div></div><div class="h5"><br><br>Robert<br><br>----- "Parnell Springmeyer" <<a href="mailto:ixmatus@gmail.com" target="_blank">ixmatus@gmail.com</a>> wrote:
<br>> Because the list has about 3000 items in it, and for each item about 20-50 HTTP requests are made; I needed a way of parallelizing the operations (instead of stepping through the list one by one) but in a controlled fashion and using a round robin strategy (worker pool).<div>
<br>> </div><div><div><div class="gmail_quote">> On Fri, Jul 22, 2011 at 6:10 AM, David Mercer <span dir="ltr"><<a href="mailto:dmercer@gmail.com" target="_blank">dmercer@gmail.com</a>></span> wrote:<br>> <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
I was curious about that, too. Hoping you'll get a response...<br>>
<br>>
> -----Original Message-----<br>>
> From: <a href="mailto:erlang-questions-bounces@erlang.org" target="_blank">erlang-questions-bounces@erlang.org</a> [mailto:<a href="mailto:erlang-questions-" target="_blank">erlang-questions-</a><br>>
> <a href="mailto:bounces@erlang.org" target="_blank">bounces@erlang.org</a>] On Behalf Of Robert Virding<br>>
> Sent: Wednesday, July 20, 2011 8:42 PM<br>>
> To: Parnell Springmeyer<br>>
> Cc: erlang-questions<br>>
> Subject: Re: [erlang-questions] Process pool map/3 implementation<br>>
><br>>
> One quick question: what was wrong with the straightforward solution of<br>>
> just spawning one process for each element in the list? Did this break<br>>
> or do you actually need more control?<br>>
><br>>
> Robert<br>>
<div><div></div><div>> ><br>>
> ----- "Parnell Springmeyer" <<a href="mailto:ixmatus@gmail.com" target="_blank">ixmatus@gmail.com</a>> wrote:<br>>
><br>>
> > -----BEGIN PGP SIGNED MESSAGE-----<br>>
> > Hash: SHA1<br>>
> ><br>>
> > For a work project I have a large list (thousands of items) to process<br>>
> > and at first built a "pmap" implementation as per Joe's book until I<br>>
> > found the plists module (which is awesome btw).<br>>
> ><br>>
> > There is one glaring issue with the list -> subdivide -> spawn x<br>>
> > processes for n sublist items strategy; if an item in the sublist takes<br>>
> > longer than all the other items it blocks the entire resource allotment<br>>
> > until it is done.<br>>
> ><br>>
> > In most cases, the plists/pmap implementation works just fine because<br>>
> > the items in the list probably don't take more than a few milliseconds<br>>
> > to map the fun over. However, it does become an issue when that is not<br>>
> > the case.<br>>
> ><br>>
> > So, I figured the next best strategy would be to implement a process<br>>
> > pool since it would allow for slow running processes to continue their<br>>
> > work while finished processes can die and new processes spawned into the<br>>
> > pool ready for work - so none of the resources are sitting idle.<br>>
> ><br>>
> > Right now, my module isn't nearly as feature-complete as the plists<br>>
> > module is - this is only a drop in replacement for map. Please submit<br>>
> > your criticisms and comments to me at this address.<br>>
> ><br>>
> > You may find the code on BitBucket:<br>>
> > <a href="https://bitbucket.org/ixmatus/ppool" target="_blank">https://bitbucket.org/ixmatus/ppool</a><br>>
> ><br>>
> > - --<br>>
> > Parnell "ixmatus" Springmeyer (<a href="http://ixmat.us" target="_blank">http://ixmat.us</a>)<br>>
</div></div><div><div></div><div>> > > _______________________________________________<br>>
> > erlang-questions mailing list<br>>
> > <a href="mailto:erlang-questions@erlang.org" target="_blank">erlang-questions@erlang.org</a><br>>
> > <a href="http://erlang.org/mailman/listinfo/erlang-questions" target="_blank">http://erlang.org/mailman/listinfo/erlang-questions</a><br>>
<br>>
</div></div></blockquote></div><br>> <br clear="all"><br>> -- <br>> Parnell "ixmatus" Springmeyer (<a href="http://ixmat.us" target="_blank">http://ixmat.us</a>)<br>>
</div></div>
<br></div></div>
</div></div></blockquote></div><br><br clear="all"><br>-- <br>Parnell "ixmatus" Springmeyer (<a href="http://ixmat.us" target="_blank">http://ixmat.us</a>)<br>
</div>