[erlang-questions] On a positive Erlang performance note...

Edwin Fine erlang-questions_efine@REDACTED
Fri Nov 21 00:34:55 CET 2008


On Thu, Nov 20, 2008 at 6:12 PM, Kevin Scaldeferri <kevin@REDACTED> wrote:

> Pretty nice.  I'm curious, how did you decide to parallelize the IO?
>

I didn't parallelize the I/O. I loaded the whole file into a binary and then
did the parallel ops by passing a sub-binary to each process based on offset
and length. To be fair, I was trying to see how well the character
classification routine would fare, so I deliberately left out the I/O. IMHO
parallelizing I/O is a totally different kettle of fish that I'm not ready
to take on right now :)
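
For anyone who wants to see the shape of the offset-and-length approach, here
is a minimal sketch. It is not the charclass module from this thread: the
module name charclass_sketch, its function names, and the reduction to three
counters (digit, ascsp and '8bit') are assumptions made purely for
illustration. It slices one binary into sub-binaries by pattern matching,
hands each slice to rpc:pmap/3, and keeps each count in its own function
argument rather than in a dict or array.

```erlang
-module(charclass_sketch).
-export([classify/1, par_classify/2, count_chunk/1]).

%% Sequential version: classify every byte of Bin in the calling process.
%% Returns {Digit, AscSp, EightBit}.
classify(Bin) when is_binary(Bin) ->
    count_chunk(Bin).

%% Parallel version: cut Bin into N sub-binaries by offset and length,
%% classify each one via rpc:pmap/3, then add up the partial counts.
par_classify(Bin, N) when is_binary(Bin), is_integer(N), N > 0 ->
    Chunks = [sub_binary(Bin, Off, Len)
              || {Off, Len} <- ranges(byte_size(Bin), N)],
    Results = rpc:pmap({?MODULE, count_chunk}, [], Chunks),
    lists:foldl(fun merge/2, {0, 0, 0}, Results).

%% Must be exported because rpc:pmap/3 calls it through apply/3.
count_chunk(Bin) ->
    count(Bin, 0, 0, 0).

%% One accumulator argument per class (three here, for brevity).
count(<<C, Rest/binary>>, Digit, AscSp, EightBit) when C >= $0, C =< $9 ->
    count(Rest, Digit + 1, AscSp, EightBit);
count(<<32, Rest/binary>>, Digit, AscSp, EightBit) ->
    count(Rest, Digit, AscSp + 1, EightBit);
count(<<C, Rest/binary>>, Digit, AscSp, EightBit) when C >= 128 ->
    count(Rest, Digit, AscSp, EightBit + 1);
count(<<_, Rest/binary>>, Digit, AscSp, EightBit) ->
    count(Rest, Digit, AscSp, EightBit);
count(<<>>, Digit, AscSp, EightBit) ->
    {Digit, AscSp, EightBit}.

%% Extract Len bytes starting at byte offset Off.
sub_binary(Bin, Off, Len) ->
    <<_:Off/binary, Chunk:Len/binary, _/binary>> = Bin,
    Chunk.

%% Split Size bytes into N {Offset, Length} ranges; the last range
%% absorbs any remainder.
ranges(Size, N) ->
    ranges(0, Size, Size div N, N).

ranges(Off, Size, _Base, 1) ->
    [{Off, Size - Off}];
ranges(Off, Size, Base, K) ->
    [{Off, Base} | ranges(Off + Base, Size, Base, K - 1)].

%% Element-wise sum of two partial results.
merge({D1, S1, E1}, {D2, S2, E2}) ->
    {D1 + D2, S1 + S2, E1 + E2}.
```

With the file already read into a binary (for example with
file:read_file/1), charclass_sketch:par_classify(Bin, 4) would use four
slices. For a large binary, a matched sub-binary generally refers to the
original data instead of copying it, which is what makes this kind of split
cheap.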

Ed

>
> -kevin
>
>
> On Nov 20, 2008, at 2:47 PM, Edwin Fine wrote:
>
> I've previously expressed some disappointment in Erlang's worst-case
> scenario SMP performance (lots of messages, very little done with each
> message, communicating parallel SMP processes).
>
> Now, to balance that out, I want to praise Erlang (and HiPE) for a
> virtually linear SMP speedup in a character classification test I ran.
>
> Using sequential and parallelized code (rpc:pmap) and a HiPE-compiled
> module on an Intel Q6600/Ubuntu/8GB/R12B-4, which classified every byte of a
> 40MB file into character classes (e.g. punct, blank), the following results
> were achieved:
>
> Sequential: 18817812 bytes/second
> Parallelized: 74937454 bytes/second
> Speedup: 3.98 (on a 4-core system).
>
> That's extremely close to linear and is pretty impressive. I found HiPE
> gave about a 10x speedup over BEAM. The results are with HiPE and exclude
> the time taken to load the file into a binary.
>
> Kudos to Erlang and HiPE.
>
> I had to write some ugly code to get this level of performance, though:
> keeping the counts in dicts or arrays was just too slow, so I fell back to
> passing a separate parameter for each of the 12 counts.
>
> The outputs follow (ascsp is something I added - counts ASCII SP (32)
> chars, and the purpose of '8bit' is self-evident). The other character
> classes are defined as per http://en.wikipedia.org/wiki/Regular_expression.
>
> Regards,
> Ed
>
> PS In the unlikely event that someone wants the code, I will gladly post it
> if asked.
> ------------
>
> 31> c(charclass,[native]).
> {ok,charclass}
> 32> charclass:bm("/home/efine/erlang/otp_src_R12B-3.tar.gz").
> *** Completed run using classify_binary ***
> File "/home/efine/erlang/otp_src_R12B-3.tar.gz" size is 42195557 bytes
> Classified 42195557 characters
> Breakdown:
> [{'8bit',21296511},
>  {alnum,10079032},
>  {alpha,8454737},
>  {ascsp,158524},
>  {blank,318577},
>  {cntrl,5379355},
>  {digit,1624295},
>  {lower,4269350},
>  {print,15519691},
>  {punct,5282135},
>  {space,963777},
>  {upper,4185387}]
> Speed = 18817812 bytes/sec (0.05314113995461655 us/byte)
> ok
> 33> charclass:bm_par("/home/efine/erlang/otp_src_R12B-3.tar.gz").
> *** Completed run using par_classify_binary ***
> File "/home/efine/erlang/otp_src_R12B-3.tar.gz" size is 42195557 bytes
> Classified 42195557 characters
> Breakdown:
> [{'8bit',21296511},
>  {alnum,10079032},
>  {alpha,8454737},
>  {ascsp,158524},
>  {blank,318577},
>  {cntrl,5379355},
>  {digit,1624295},
>  {lower,4269350},
>  {print,15519691},
>  {punct,5282135},
>  {space,963777},
>  {upper,4185387}]
> Speed = 74937454 bytes/sec (0.013344461835164304 us/byte)
>