Mon Sep 10 14:41:29 CEST 2012
>I started playing with a simple implementation of MD5 (basically straight from the wiki page)
>Once the implementation produced the correct output, I did some timing for fun and also
>tried to native compile it.
>To my astonishment I discovered that the code was slower when native compiled.
>I am using R15B01 on a Mac. Can this be true? And if so, why?
>Note that I do not care about optimizing the implementation itself! I am more interested in why
>the native compiler produces slower code on this example.
On my Mac, the best native result is 7% faster than the best BEAM result. But the timing varies by 20% from run to run, so I think the proper claim is that they are about the same. (I also think HiPE ought to comfortably beat BEAM on this sort of code.)
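With that much run-to-run noise, a comparison is only meaningful if each variant is run several times and the best times are compared. A minimal sketch of such a harness follows. The module name `bench` and both function names are made up for illustration. It assumes the module under test can be compiled from the current directory and that the runtime was built with HiPE support, so that the `native` option takes effect.

```erlang
-module(bench).
-export([best_of/4, compare/4]).

%% Run M:F(Args) N times and return the fastest wall-clock time
%% in microseconds. Taking the minimum filters out runs that were
%% disturbed by the scheduler, GC, or other processes.
best_of(M, F, Args, N) when is_integer(N), N > 0 ->
    lists:min([element(1, timer:tc(M, F, Args)) || _ <- lists:seq(1, N)]).

%% Compile and load Mod as plain BEAM code, time it, then recompile
%% with the native option and time it again.
%% Returns {BestBeamMicros, BestNativeMicros}.
compare(Mod, F, Args, N) ->
    {ok, Mod} = c:c(Mod),
    Beam = best_of(Mod, F, Args, N),
    {ok, Mod} = c:c(Mod, [native]),
    Native = best_of(Mod, F, Args, N),
    {Beam, Native}.
```

Something like `bench:compare(md5, md5, [Data], 20)` would then give the two best times side by side, where `md5:md5/1` and `Data` stand in for whatever the module under test exports and a suitably large input binary.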
Unfortunately, 'pp_native' gave broken/confusing output so it's hard to trace what happens. The arithmetic looks like it's being inlined, which ought to provide a boost. There are calls to r/1 and rotate/2 in the inner loop (md5_/6) which possibly should have been inlined.
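One way to test the inlining theory is to ask the compiler to inline the helper explicitly with the `inline` compile directive and then repeat the timings. The sketch below is only an illustration: the original `rotate/2` was not posted, so this body is my guess at a 32-bit left rotation, and `step/3` is a made-up caller standing in for the inner loop. Whether it changes the native timings would have to be measured.

```erlang
-module(rot).
%% Ask the compiler to inline local calls to rotate/2.
-compile({inline, [{rotate, 2}]}).
-export([rotate/2, step/3]).

%% Hypothetical 32-bit left rotation of X by N bits (0 < N < 32).
rotate(X, N) when N > 0, N < 32 ->
    ((X bsl N) bor (X bsr (32 - N))) band 16#FFFFFFFF.

%% Hypothetical caller: add B to the rotated value, modulo 2^32.
step(A, B, S) ->
    (B + rotate(A, S)) band 16#FFFFFFFF.
```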