[erlang-questions] Was there any Erlang in the heated benchmark discussion?
Bengt Kleberg
bengt.kleberg@REDACTED
Wed Dec 19 08:23:42 CET 2007
Some people were unfortunate enough to read the heated discussion part
of the harsh benchmark criticism thread (even though I wrote "do not
read this"). Those who read it anyway might wonder whether there was any
Erlang connection.
Short answer: Yes, indirectly.
Long answer: Yes, because some shootout tests would not allow Erlang's
good sides to be seen, thanks to an artificial cap on the input. To show
how Erlang might benefit from raising the limit individually for each
language, let us create a test:
The benchmark test T consists of counting items. The number of items to
count is given as an argument N. We have 0.1 second granularity in the
timing. After 2 minutes we assume that the test is hanging and kill it.
We have two languages, M(ainstream) and O(dd). M takes 0 seconds to
start and counts 1 item in 1 millisecond. Language O takes 1 second to
start and counts 1 item in 1 millisecond.
M can count 1024 items before crashing. O can count 1,048,576.
If we choose to use a limited set of N we get the following:
N    10     100    1000
M    0.0    0.1    1.0
O    1.0    1.1    2.0
In the shootout it is not permitted to raise the fixed limit to
something that M cannot handle.
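The numbers in the table follow directly from the model described above.
A minimal sketch in Python, assuming the timer truncates (rather than
rounds) to its 0.1 second granularity; the names measured_time,
STARTUP_MS and the use of integer milliseconds are my own choices for
illustration, not anything from the shootout:

```python
# Hypothetical model of test T for the two made-up languages M and O.
# All times are kept in integer milliseconds to avoid float noise.

GRANULARITY_MS = 100          # 0.1 second timer granularity
PER_ITEM_MS = 1               # both languages count 1 item in 1 ms
STARTUP_MS = {"M": 0, "O": 1000}  # M starts instantly, O needs 1 second


def measured_time(lang, n):
    """Return the reported time in seconds for counting n items.

    Assumption: the timer truncates to the granularity.
    """
    exact_ms = STARTUP_MS[lang] + n * PER_ITEM_MS
    truncated_ms = (exact_ms // GRANULARITY_MS) * GRANULARITY_MS
    return truncated_ms / 1000


if __name__ == "__main__":
    for lang in ("M", "O"):
        row = [measured_time(lang, n) for n in (10, 100, 1000)]
        print(lang, row)
```

With a fixed set of N the only visible difference between M and O is the
constant 1 second startup cost.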
If we stop using a fixed set of values, and instead let N increase until
exhaustion/crash and then stop (as per my suggestion) we get:
N    10     100    1000   10000     100000   1000000
M    0.0    0.1    1.0    crashed
O    1.0    1.1    2.0    11        101      killed
The shootout can still use the result from 1000 in the comparison table,
but in the graphs we get better information about M and O.
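The "increase N until exhaustion" procedure can be sketched as a small
driver loop. This is only an illustration under the assumptions of the
toy model above: run and sweep are hypothetical names, the crash limits
and the 2 minute timeout are the ones from the example, and N is
multiplied by 10 each round as in the table:

```python
# Hypothetical driver: grow N per language until the run crashes or is killed.

TIMEOUT_MS = 120_000                      # kill after 2 minutes
GRANULARITY_MS = 100                      # 0.1 second timer granularity
STARTUP_MS = {"M": 0, "O": 1000}          # startup cost per language
MAX_ITEMS = {"M": 1024, "O": 1_048_576}   # items countable before crashing


def run(lang, n):
    """Simulate one run: a time in seconds, or 'crashed' / 'killed'."""
    if n > MAX_ITEMS[lang]:
        return "crashed"
    total_ms = STARTUP_MS[lang] + n       # 1 ms per item
    if total_ms > TIMEOUT_MS:
        return "killed"
    # Assumption: the timer truncates to its granularity.
    return (total_ms // GRANULARITY_MS) * GRANULARITY_MS / 1000


def sweep(lang, start=10, factor=10):
    """Increase N until the language crashes or is killed, then stop."""
    results = []
    n = start
    while True:
        outcome = run(lang, n)
        results.append((n, outcome))
        if isinstance(outcome, str):      # 'crashed' or 'killed' ends the sweep
            return results
        n *= factor


if __name__ == "__main__":
    for lang in ("M", "O"):
        print(lang, sweep(lang))
```

Each language gets its own stopping point, so the cap for O is no longer
dictated by what M can survive.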
This might sound like a silly test. However, there was a process
creation test in the shootout. Some mainstream languages could only
handle fewer than ten thousand processes. Erlang could do better, but N
was limited to give the mainstream languages a chance to do the test for
all values.
This is why I want to allow N to increase until exhaustion for each
language in the shootout, instead of capping N with the same value for
all languages in each test. The method will also make it possible to
avoid a current shootout problem: several languages end up very close
together at about 1 second, because the maximum N is set by a language
that takes a long time for that test.
Do I think that this will stop all attempts to help languages like M
look better in some test? No. Consider the possibility of changing T to
count 1 item N times instead. Or changing T to first increment 1 item,
then decrement 1 item, for each N. These kinds of helpful designs were
present in the shootout to help mainstream languages. When it makes
sense to limit things, it is ok (e.g. reading a file in chunks).
bengt
--
Those were the days...
EPO guidelines 1978: "If the contribution to the known art resides
solely in a computer program then the subject matter is not
patentable in whatever manner it may be presented in the claims."