[erlang-questions] benchmarks game harsh criticism
Sat Dec 1 02:49:59 CET 2007
Isaac Gouy wrote:
> --- David Hopwood <david.hopwood@REDACTED> wrote:
>>>> This is an elementary error, sufficiently serious that it's not
>>>> enough just for the FAQ to mention it in passing. It
>>>> systematically biases the results against language implementations
>>>> with a significant startup/shutdown time, or other fixed overheads.
>>>> Combined with the fact that most of the benchmarks only run for a few
>>>> seconds, the resulting bias is quite large.
>>> Specifically how large is the resulting bias?
>> Probably about 10% in some cases (for JVM-based implementations and
> Sorry, I haven't figured out a way to make sense of that - 10% of what?
Of some of the benchmark times. What else?
> I'm also a little puzzled that you say "probably about 10% in some
> cases", you claimed there was a serious elementary error and the
> resulting bias is quite large - is that just speculation?
Suppose for the sake of argument that we take the 'startup' benchmark
as a rough estimate of startup/shutdown time. (I don't claim that it
is a good estimate, but it will do for this argument.)
According to the AMD Sempron results, Erlang HiPE takes 0.1992 s for
the startup benchmark on that platform (false precision, but never
mind that). It takes 0.77 s on the pidigits benchmark on the same
platform. So for this benchmark run, around 26% of the time is taken
by startup/shutdown.
If this time were not included, the Erlang HiPE entry would move
from 17th to around 12th place (if we assume that the startup/shutdown
times for the entries between those places are not significant,
which is likely to be true in this particular case).
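
To make the arithmetic above easy to check, here is a minimal sketch in
Python. The two timings are the ones quoted from the AMD Sempron results;
the function and variable names are made up for illustration, and treating
the 'startup' benchmark as the fixed overhead is the same rough assumption
as in the argument above, not a measured fact.

```python
def startup_fraction(startup_s: float, total_s: float) -> float:
    """Share of a benchmark's wall-clock time attributed to fixed startup/shutdown cost."""
    return startup_s / total_s


# Timings quoted in this message (Erlang HiPE, AMD Sempron).
startup_s = 0.1992   # 'startup' benchmark, used as a rough overhead estimate
pidigits_s = 0.77    # pidigits benchmark, total reported time

fraction = startup_fraction(startup_s, pidigits_s)

# Assumption: the overhead is simply additive, so it can be subtracted out.
adjusted_s = pidigits_s - startup_s

print(f"startup share of pidigits run: {fraction:.0%}")
print(f"pidigits time with startup removed: {adjusted_s:.4f} s")
```

This only reproduces the 26% figure. Re-ranking the entries would need the
same subtraction applied to every other implementation's time, using that
implementation's own startup estimate.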