Multi-core Erlang

Thu Mar 9 15:22:33 CET 2006

Hello list,

    Following Rickard's post I have now got my hands on a dual-core,
dual-processor system (i.e. 4 CPUs) and we have been able to reproduce
Rickard's results.

    I have posted a longer article (with images) on my newly started
blog:

    http://www.erlang-stuff.net/blog/

    This shows a 3.6 factor speedup for a message passing benchmark and
1.8 for an application program (a SIP stack) - these are good results.
The second program in particular was not written to avoid sequential
bottlenecks.

     Despite this it ran 1.8 times faster on a 4 CPU system than on a
one CPU system.
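
     One rough way to read these numbers is through Amdahl's law,
S = 1 / ((1 - P) + P / N), where P is the fraction of the work that can
run in parallel and N is the number of CPUs. Applying it here is an
assumption for illustration, not something that was measured, and the
module and function names below are made up for the sketch:

```erlang
-module(amdahl_sketch).
-export([speedup/2, parallel_fraction/2]).

%% Amdahl's law: predicted speedup on N CPUs when a fraction P
%% (0.0 to 1.0) of the work can run in parallel.
speedup(P, N) ->
    1 / ((1 - P) + P / N).

%% The same formula solved for P, given an observed speedup S on N CPUs:
%% P = (1 - 1/S) / (1 - 1/N).
parallel_fraction(S, N) when N > 1 ->
    (1 - 1 / S) / (1 - 1 / N).
```

     Under that model, amdahl_sketch:parallel_fraction(1.8, 4) comes out
at roughly 0.59 for the SIP stack and amdahl_sketch:parallel_fraction(3.6, 4)
at roughly 0.96 for the message passing benchmark.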

     The nice thing about these results was that the benchmark ran
almost 4 times faster. This benchmark just did spawns, message passing
and computations, and had no sequential bottlenecks - pure code made
from lots of small processes seems to speed up nicely on a multi-core
system.
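
     For readers without the attachment, here is a minimal sketch of
this kind of all-to-all benchmark, based only on the description in the
quoted message (every process sends a message to every other process
and waits for the replies). It is NOT Rickard's big.erl; the module
name, the message shapes and the timing approach are all assumptions:

```erlang
-module(bang_sketch).
-export([bang/1]).

%% Run the all-to-all exchange with N processes and return the
%% elapsed wall-clock time in microseconds.
bang(N) when is_integer(N), N > 0 ->
    {Micros, ok} = timer:tc(fun() -> run(N) end),
    Micros.

run(N) ->
    Parent = self(),
    Pids = [spawn(fun() -> init(Parent) end) || _ <- lists:seq(1, N)],
    %% Tell every process who its peers are; this starts the exchange.
    lists:foreach(fun(Pid) -> Pid ! {peers, Pids} end, Pids),
    %% Wait until every process has collected all of its replies.
    lists:foreach(fun(Pid) -> receive {done, Pid} -> ok end end, Pids),
    lists:foreach(fun(Pid) -> Pid ! stop end, Pids),
    ok.

init(Parent) ->
    receive
        {peers, Pids} ->
            Others = [P || P <- Pids, P =/= self()],
            lists:foreach(fun(P) -> P ! {ping, self()} end, Others),
            loop(Parent, length(Others))
    end.

%% Answer incoming pings and count down the outstanding pongs.
loop(Parent, 0) ->
    Parent ! {done, self()},
    serve();
loop(Parent, Outstanding) ->
    receive
        {ping, From} -> From ! {pong, self()}, loop(Parent, Outstanding);
        {pong, _From} -> loop(Parent, Outstanding - 1)
    end.

%% Keep answering late pings from slower peers until told to stop.
serve() ->
    receive
        {ping, From} -> From ! {pong, self()}, serve();
        stop -> ok
    end.
```

     Calling bang_sketch:bang(300) returns the elapsed time in
microseconds. With N processes there are N*(N-1) pings and as many
pongs in flight, and almost every process is runnable at once, which is
what gives several schedulers something to do.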

     Well done Rickard

     So how do you make stuff that goes fast? - go parallel

Cheers

/Joe

> -----Original Message-----
> From: owner-erlang-questions@REDACTED 
> [mailto:owner-erlang-questions@REDACTED] On Behalf Of Rickard Green
> Sent: den 7 mars 2006 17:52
> To: erlang-questions@REDACTED
> Subject: [Fwd: Message passing benchmark on smp emulator]
> 
> Trying again...
> 
> -------- Original Message --------
> Subject: Message passing benchmark on smp emulator
> Date: Tue, 07 Mar 2006 17:30:40 +0100
> From: Rickard Green <rickard.s.green@REDACTED>
> Newsgroups: erix.mailing-list.erlang-questions
> 
> The message passing benchmark used in estone (and bstone) 
> isn't very well suited for the smp emulator since it sends a 
> message in a ring (more or less only 1 process runnable all the time).
> 
> In order to be able to take advantage of an smp emulator I 
> wrote another message passing benchmark. In this benchmark 
> all participating processes send a message to all processes 
> and wait for replies to the sent messages.
> 
> I've attached the benchmark. Run like this:
> big:bang(NoOfParticipatingProcesses).
> 
> I ran the benchmark on a machine with two hyperthreaded Xeon 
> 2.40GHz processors.
> 
> big:bang(50):
> * r10b completed after about 0.014 seconds.
> * p11b with 4 schedulers completed after about 0.018 seconds.
> 
> big:bang(100):
> * r10b completed after about 0.088 seconds.
> * p11b with 4 schedulers completed after about 0.088 seconds.
> 
> big:bang(300):
> * r10b completed after about 2.6 seconds.
> * p11b with 4 schedulers completed after about 1.0 seconds.
> 
> big:bang(500):
> * r10b completed after about 10.7 seconds.
> * p11b with 4 schedulers completed after about 3.5 seconds.
> 
> big:bang(600):
> * r10b completed after about 18.0 seconds.
> * p11b with 4 schedulers completed after about 5.8 seconds.
> 
> big:bang(700):
> * r10b completed after about 27.0 seconds.
> * p11b with 4 schedulers completed after about 9.3 seconds.
> 
> Quite a good result I guess.
> 
> Note that this is a special case and this kind of speedup 
> is not expected for an arbitrary Erlang program.
> 
> If you want to try it yourself, download a P11B snapshot at:
> http://www.erlang.org/download/snapshots/
> remember to enable smp support:
> ./configure --enable-smp-support --disable-lock-checking
> 
> You can change the number of schedulers used by passing the
> +S<NO_OF_SCHEDULERS> command line argument to erl or by calling:
> erlang:system_flag(schedulers, NoOfSchedulers) -> 
> {ok|PosixError, CurrentNo, OldNo}
> 
> /Rickard Green, Erlang/OTP
> 
> 
>
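
The timings in the quoted message can be turned into speedup ratios
(r10b time divided by p11b time with 4 schedulers). The sketch below
only does that division on the figures copied from the message; the
module and function names are invented for illustration:

```erlang
-module(speedup_sketch).
-export([speedups/0]).

%% {NoOfProcesses, R10BSeconds, P11BSeconds}, copied from the
%% quoted message above.
timings() ->
    [{50, 0.014, 0.018}, {100, 0.088, 0.088}, {300, 2.6, 1.0},
     {500, 10.7, 3.5}, {600, 18.0, 5.8}, {700, 27.0, 9.3}].

%% Speedup of p11b (4 schedulers) over r10b for each run size.
speedups() ->
    [{N, R10B / P11B} || {N, R10B, P11B} <- timings()].
```

The ratios come out at roughly 0.8, 1.0, 2.6, 3.1, 3.1 and 2.9 for 50,
100, 300, 500, 600 and 700 processes: the smp emulator is slightly
slower on the smallest run and about 3 times faster once there are
enough processes to keep all four schedulers busy.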