[erlang-questions] ANN: gproc_pool + some performance tidbits

pablo platt pablo.platt@REDACTED
Tue Jun 4 21:30:47 CEST 2013


What's the use case for workers in the pool?
Is it only for distributing tasks, or also for implementing a pool of DB
connections like https://github.com/devinus/poolboy?

Why do workers have names?
I know I can just give them names such as 0, 1, 2..., but I'm trying to
understand the rationale.

As always, I'm sure this functionality will be a major part of my server,
like everything else in gproc, even if I don't yet know why ;)

Thanks




On Tue, Jun 4, 2013 at 10:24 PM, Ulf Wiger <ulf@REDACTED> wrote:

>
> On 4 Jun 2013, at 18:52, ANTHONY MOLINARO wrote:
>
> Hi Ulf,
>
> Have you done any concurrent tests? I only ask because I've seen our own
> pooling code (https://github.com/openx/gen_server_pool) have issues under
> load. In our case that's because a single gen_server acts as a dispatch
> layer, which shouldn't be an issue for gproc, since IIRC it uses ets to
> provide fast concurrent access (something also done in a novel way by
> https://github.com/ferd/dispcount/, which I keep meaning to try out). So
> I'd be curious to know whether you've done any concurrent testing that
> bears this out.
>
>
> I hadn't, but did so now.
>
> Spawning N clients, which run 1000 iterations each, on e.g. a round_robin
> pool:
>
>     N   Avg usec/iteration
>     1                  37
>    10                 250
>   100                1630
>  1000               18813
>
> Of course, this was a pretty nasty test, with all processes banging away
> at the pool as fast as they possibly could. If you want frequent mutex
> conflicts, that's probably as good a way as any to provoke them.
>
> When I insert a random sleep (0-50 ms) between each iteration, time each
> pick request and collect the averages, 100 concurrent workers pay on
> average 50 usec per selection. For 1000 concurrent workers, the average
> rises to 60 usec.
>
> The corresponding average for the hash pool and 1000 concurrent workers is
> 20 usec.
>
> (All on my Macbook Air)
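>
> For reference, a rough sketch of what such a test might look like (this
> is not the actual test code; the function names, the message shape and
> the overall structure are made up, and only gproc_pool:pick/1 and
> crypto:rand_uniform/2 are taken from this thread):
>
>     %% Spawn NClients processes, each picking from Pool 1000 times with
>     %% a random 0-50 ms pause, and return the average pick time in usec.
>     bench(Pool, NClients) ->
>         Parent = self(),
>         Pids = [spawn_link(fun() ->
>                                    Parent ! {self(), client(Pool, 1000)}
>                            end) || _ <- lists:seq(1, NClients)],
>         Avgs = [receive {Pid, Avg} -> Avg end || Pid <- Pids],
>         lists:sum(Avgs) / length(Avgs).
>
>     client(Pool, Iters) ->
>         Times = [begin
>                      timer:sleep(crypto:rand_uniform(0, 51)),
>                      {T, _Worker} = timer:tc(gproc_pool, pick, [Pool]),
>                      T
>                  end || _ <- lists:seq(1, Iters)],
>         lists:sum(Times) / length(Times).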
>
>
> I think the number of pool implementations in Erlang has probably finally
> surpassed the number of JSON parsers ;)
>
>
> Well, that tends to happen with fun and reasonably well-bounded problems.
> ;)
>
> BR,
> Ulf W
>
>
> -Anthony
>
> On Jun 4, 2013, at 2:18 AM, Ulf Wiger <ulf@REDACTED> wrote:
>
>
> I pushed a new gproc component called gproc_pool the other day.
>
> The main idea, apart from wanting to see how well it would work, was that
> I wanted to be able to register servers with gproc and then have an
> efficient way of pooling between them. A benefit of using gproc throughout
> is that the registration objects serve as a 'footprint' for each process -
> by listing the gproc entities for each process, you can tell a lot about
> its purpose.
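>
> For example (a shell sketch, not from the announcement; it assumes
> gproc:info/1, which returns process_info-style data extended with a
> {gproc, Entries} element -- treat the exact shape as an assumption):
>
>     %% WorkerPid is the pid of one of the connected workers
>     {gproc, Entries} = lists:keyfind(gproc, 1, gproc:info(WorkerPid)),
>     %% Entries now lists the process's gproc names, counters, etc.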
>
> The way gproc_pool works is as follows (a short usage sketch is given
> after the list):
> 1. You define a pool, by naming it, and optionally specifying its size
>     (gproc_pool:new(Pool) | gproc_pool:new(Pool, Type, Options))
> 2. You add worker names to the pool
>    (gproc_pool:add_worker(Pool, Name))
> 3. Your servers each connect to a given name
>    (gproc_pool:connect_worker(Pool, Name))
> 4. Users pick a worker for each request (gproc_pool:pick(Pool))
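>
> A minimal sketch of those four steps (the pool name, worker names and
> message format below are made up for illustration; gproc_pool:pick/1 is
> assumed to return a gproc name that can be passed to gproc:send/2):
>
>     %% 1 + 2: define the pool and add worker names (done once)
>     setup() ->
>         gproc_pool:new(my_pool),
>         [gproc_pool:add_worker(my_pool, W) || W <- [a, b, c]],
>         ok.
>
>     %% 3: each worker process connects to its name, e.g. in init/1
>     worker_init(Name) ->
>         gproc_pool:connect_worker(my_pool, Name),
>         ok.
>
>     %% 4: callers pick a worker and send it a request
>     call_pool(Msg) ->
>         Worker = gproc_pool:pick(my_pool),
>         gproc:send(Worker, {request, self(), Msg}).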
>
> My little test code indicates that the different load-balancing strategies
> perform a bit differently:
>
> (https://github.com/uwiger/gproc/blob/master/src/gproc_pool.erl#L843)
>
> (Create a pool, add 6 workers and iterate 100k times,
> incrementing a gproc counter for each iteration.)
>
> 3> gproc_pool:test(100000,round_robin,[]).
> worker stats (848):
> [{a,16667},{b,16667},{c,16667},{d,16667},{e,16666},{f,16666}]
> {2801884,ok}
> 4> gproc_pool:test(100000,hash,[]).
> worker stats (848):
> [{a,16744},{b,16716},{c,16548},{d,16594},{e,16749},{f,16649}]
> {1891517,ok}
> 5> gproc_pool:test(100000,random,[]).
> worker stats (848):
> [{a,16565},{b,16542},{c,16613},{d,16872},{e,16727},{f,16681}]
> {3701011,ok}
> 6> gproc_pool:test(100000,direct,[]).
> worker stats (848):
> [{a,16667},{b,16667},{c,16667},{d,16667},{e,16666},{f,16666}]
> {1766639,ok}
> 11> gproc_pool:test(100000,claim,[]).
> worker stats (848):
> [{a,100000},{b,0},{c,0},{d,0},{e,0},{f,0}]
> {7569425,ok}
>
>
> The worker stats show how evenly the workers were selected,
> and the {Time, ok} comes from timer:tc/3, i.e. Time/100000 is the
> per-iteration cost:
>
> round_robin: 28 usec (maintain a 'current' counter, modulo Size)
> hash:        19 usec (gproc_pool:pick(Pool, Val), hash on Val)
> random:      37 usec (pick a random worker, using crypto:rand_uniform/2)
> direct:      18 usec (gproc_pool:pick(Pool, N), where N modulo Size
>              selects the worker)
> claim:       76 usec (claim the first available worker, apply a fun,
>              then release)
>
> I think the per-selection cost is acceptable as-is, but could perhaps be
> improved (esp. the 'random' strategy is surprisingly expensive). All the
> selection work is done in the caller's process, BTW - no communication with
> the gproc or gproc_pool servers (except for admin tasks).
>
> The 'claim' strategy is also surprisingly expensive. I believe that's
> because I'm using gproc:select/3 to find the first free worker. Note also
> that it results in an extremely uneven distribution. That's obviously
> because the test run claims the first available worker and then releases it
> before iterating, so it always selects the first worker.
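>
> To make the pick call patterns above concrete, a small hedged sketch
> (the pool names are illustrative, and it assumes that the pool type
> given to gproc_pool:new/3 determines how the second argument of pick/2
> is interpreted, as described above):
>
>     %% e.g. gproc_pool:new(rr_pool),
>     %%      gproc_pool:new(hash_pool, hash, []),
>     %%      gproc_pool:new(dir_pool, direct, [])
>     pick_examples(SessionId, N) ->
>         W1 = gproc_pool:pick(rr_pool),               % no key: pool's own strategy
>         W2 = gproc_pool:pick(hash_pool, SessionId),  % hash: same key -> same worker
>         W3 = gproc_pool:pick(dir_pool, N),           % direct: N modulo pool size
>         {W1, W2, W3}.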
>
> https://github.com/uwiger/gproc/blob/master/doc/gproc_pool.md
>
> Feedback welcome, be it performance tips, usability tips, or anything else.
>
> BR,
> Ulf W
>
> Ulf Wiger, Co-founder & Developer Advocate, Feuerlabs Inc.
> http://feuerlabs.com
>
>
> Ulf Wiger, Co-founder & Developer Advocate, Feuerlabs Inc.
> http://feuerlabs.com
>