[erlang-questions] OTP process startup vs spawn overhead

Thu Jul 28 14:25:40 CEST 2011

On 07/28/2011 02:04 PM, Heinrich Venter wrote:
 > I was executing the tests from the shell yes, but if I do the
 > following with a compiled function
 >
 > [ timer:tc(fun() ->  [ Pid ! ok || Pid<- [spawn(fun() ->
 > test_gen:loop() end) || _<- lists:seq(1,10000)] ] end) || _<-
 > lists:seq(1,10)].
 >
> I still get the same slow performance.  It must be related to creation
> of the fun then.
>
> This gets better results
> [ timer:tc(fun() ->  [ Pid ! ok || Pid<- [spawn(test_gen, loop, []) ||
> _<- lists:seq(1,10000)] ] end) || _<- lists:seq(1,10)].
>
> It also shows about a 45% performance penalty for spawning OTP
> processes like mad instead of using spawn(Module, Function, Args).
> On the other hand using spawn(Fun) seems to be 4 times slower.  This
> is more or less in line with what the efficiency guide says.
>
> It pays to measure :)
>
> Moral of the story: Spawn your workers with spawn(M,F,A) or use OTP
> processes if you can afford the slight performance hit and gain a lot
> of convenience.

It also pays to know *what* you measure. In both variants above, your 
code executes the call timer:tc(fun () -> ... end) ten times. The 
fun-object is created and passed to timer:tc/1 before the timing starts, 
so its construction is not part of the measurement. So far so good.

However, within each measurement you call spawn() 10 000 times, and in 
your first example, for each of these times you create a fresh 
fun-object `fun() -> test_gen:loop() end' despite the fact that this 
subexpression is actually constant and could be lifted out of the loop, 
like this:

   F = fun() -> test_gen:loop() end,
   [ timer:tc(fun() ->
               [ Pid ! ok
                || Pid <- [spawn(F) || _<-lists:seq(1,10000)] ]
              end)
    || _<-lists:seq(1,10)].

In your version, you pay the cost both of allocating and instantiating 
10 000 fun-objects as well as of garbage collecting these objects (since 
they immediately become garbage).
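
If you want to check this yourself, here is a minimal sketch that times 
all three spawn styles from compiled code. The module name spawn_bench 
and its loop/0 are my own stand-ins (test_gen:loop/0 was not shown in 
this thread), so treat them as assumptions; the numbers will depend on 
your Erlang/OTP release and machine.

```erlang
-module(spawn_bench).
-export([run/1, loop/0]).

%% Hypothetical stand-in for test_gen:loop/0:
%% wait for a single message, then terminate.
loop() ->
    receive _ -> ok end.

%% Time three ways of spawning N processes and sending each one a message.
%% Returns a list of {Label, Microseconds} pairs.
run(N) ->
    Seq = lists:seq(1, N),
    %% 1. The fun expression is evaluated on every iteration.
    {Fresh, _} = timer:tc(fun() ->
        [Pid ! ok
         || Pid <- [spawn(fun() -> spawn_bench:loop() end) || _ <- Seq]]
    end),
    %% 2. The fun is created once, outside the loop, and reused.
    F = fun() -> spawn_bench:loop() end,
    {Hoisted, _} = timer:tc(fun() ->
        [Pid ! ok || Pid <- [spawn(F) || _ <- Seq]]
    end),
    %% 3. No fun at all: spawn(Module, Function, Args).
    {Mfa, _} = timer:tc(fun() ->
        [Pid ! ok || Pid <- [spawn(spawn_bench, loop, []) || _ <- Seq]]
    end),
    [{fresh_fun, Fresh}, {hoisted_fun, Hoisted}, {mfa, Mfa}].
```

Compile it and call spawn_bench:run(10000) a few times; a single run is 
too noisy to draw conclusions from.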

The moral should be: if you have a tight loop and really want to cut 
down on the time spent on each iteration, first see whether you can 
move out code that doesn't have to be computed within the loop in the 
first place. Creating a temporary tuple in the middle of the loop 
should have about the same overhead as creating a fun.
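
The same lifting applies to a tuple. A small sketch, with a hypothetical 
module hoist_example and helper apply_cfg/2 made up for illustration: 
both functions return the same list, but the second builds the tuple 
once instead of once per element. Whether the difference is measurable 
depends on the compiler and the workload, so measure before relying on 
it.

```erlang
-module(hoist_example).
-export([inline/3, hoisted/3]).

%% Hypothetical helper that takes its settings as a tuple.
apply_cfg({Scale, Offset}, X) ->
    X * Scale + Offset.

%% The tuple {Scale, Offset} is written inside the loop body,
%% so it is constructed again for every element of List.
inline(List, Scale, Offset) ->
    [apply_cfg({Scale, Offset}, X) || X <- List].

%% The tuple is constructed once, before the loop, and reused.
hoisted(List, Scale, Offset) ->
    Cfg = {Scale, Offset},
    [apply_cfg(Cfg, X) || X <- List].
```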

     /Richard