surprising result with hipe compilation
Kostis Sagonas
kostis@REDACTED
Wed Nov 5 17:37:53 CET 2003
Ulf Wiger wrote:
> I called timer:tc(...) several times and picked the fastest one.
>
> >If one repeats the timer:tc call, the runtime for both BEAM and
> >native code is reduced to normal levels, and native code is
> >consistently (for your code) faster than BEAM.
>
> This is not what happens on my machine (a 400 MHz Ultra 10):
> .... DELETED ....
> When compiled with hipe, the code runs significantly slower.
Ulf, I have looked at your program and have trouble obtaining the
behaviour that you are observing. When compiled to native code,
the code is consistently 20-40 % faster than BEAM (and arguably
more than that). There is indeed a variation in the times that
are reported; see below.
What I get here is:
1. ON SPARC
-----------
@hamberg [~/HiPE/tests/uffe] uname -a
SunOS hamberg.it.uu.se 5.9 Generic_112233-08 sun4u sparc SUNW,Ultra-80
@hamberg [~/HiPE/tests/uffe] ~/HiPE/otp/bin/erlc *.erl
@hamberg [~/HiPE/tests/uffe] ~/HiPE/otp/bin/erl
Erlang (BEAM) emulator version 5.4.2003.10.26 [source] [hipe]
Eshell V5.4.2003.10.26 (abort with ^G)
1> lists:sort([element(1,timer:tc(test,run,[])) || _ <- lists:seq(1,15) ]).
[4722,4789,4791,4824,4835,4859,4873,4943,4958,4960,5035,5394,5444,5654,101111]
2> lists:sort([element(1,timer:tc(test,run,[])) || _ <- lists:seq(1,15) ]).
[4594,4699,4699,4761,4785,4816,4840,4847,4899,4900,4904,4982,5068,5569,6389]
3> lists:sort([element(1,timer:tc(test,run,[])) || _ <- lists:seq(1,15) ]).
[4591,4650,4711,4732,4782,4787,4792,4797,4838,4846,4871,4958,4980,5030,5179]
4> halt().
@hamberg [~/HiPE/tests/uffe] ~/HiPE/otp/bin/erlc +native *.erl
@hamberg [~/HiPE/tests/uffe] ~/HiPE/otp/bin/erl
Erlang (BEAM) emulator version 5.4.2003.10.26 [source] [hipe]
Eshell V5.4.2003.10.26 (abort with ^G)
1> lists:sort([element(1,timer:tc(test,run,[])) || _ <- lists:seq(1,15) ]).
[3225,3265,3314,3354,3379,3405,3427,3475,3500,3557,3590,3638,3645,3987,295080]
2> lists:sort([element(1,timer:tc(test,run,[])) || _ <- lists:seq(1,15) ]).
[3194,3288,3328,3432,3480,3483,3502,3525,3532,3534,3540,3719,3884,3972,4086]
3> lists:sort([element(1,timer:tc(test,run,[])) || _ <- lists:seq(1,15) ]).
[3190,3231,3279,3348,3359,3365,3419,3461,3474,3574,3685,3779,3990,4213,4446]
Things to notice
- loading native code takes about 3x as long (295080 vs 101111)
- one can argue that Ulf's benchmark is indeed a random-number
generator, but one can more or less claim that:
- Times for BEAM are in the range [4591 - 5000]
- Times for HiPE are in the range [3190 - 3700]
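Since the one very slow sample in each first batch reflects code loading rather than the benchmark itself, the measurement can be sketched so that loading is excluded and the spread is summarized. This is only a minimal sketch: the module name bench_stats and its functions are hypothetical, and the median is taken as the lower middle element for even-length lists.

```erlang
-module(bench_stats).
-export([run/4, summarize/1]).

%% Call M:F(A) once untimed so that code loading is not measured,
%% then time N further calls with timer:tc/3 and summarize the samples.
run(M, F, A, N) when is_integer(N), N > 0 ->
    _ = apply(M, F, A),
    Times = [element(1, timer:tc(M, F, A)) || _ <- lists:seq(1, N)],
    summarize(Times).

%% Return {Min, Median, Max} for a non-empty list of microsecond timings.
%% For an even number of samples the lower of the two middle values is used.
summarize(Times) when Times =/= [] ->
    Sorted = lists:sort(Times),
    Median = lists:nth((length(Sorted) + 1) div 2, Sorted),
    {hd(Sorted), Median, lists:last(Sorted)}.
```

With the benchmark above this would be called as bench_stats:run(test, run, [], 15). Reporting the minimum and median side by side makes it easier to compare BEAM and native code without eyeballing the sorted lists.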
2. ON x86
-----------
@fan [~/HiPE/tests/uffe] uname -a
Linux fan.it.uu.se 2.4.20-20.9custom #1 SMP Tue Nov 4 21:55:46 CET 2003 i686 i686 i386 GNU/Linux
@fan [~/HiPE/tests/uffe] ~/HiPE/otp-x86/bin/erlc *.erl
@fan [~/HiPE/tests/uffe] ~/HiPE/otp-x86/bin/erl
Erlang (BEAM) emulator version 5.4.2003.10.26 [source] [hipe]
Eshell V5.4.2003.10.26 (abort with ^G)
1> lists:sort([element(1,timer:tc(test,run,[])) || _ <- lists:seq(1,15) ]).
[1304,1316,1319,1321,1328,1332,1332,1353,1356,1369,1392,1422,1439,1619,14387]
2> lists:sort([element(1,timer:tc(test,run,[])) || _ <- lists:seq(1,15) ]).
[1275,1276,1292,1316,1328,1338,1348,1360,1376,1380,1384,1409,1421,1461,1590]
3> lists:sort([element(1,timer:tc(test,run,[])) || _ <- lists:seq(1,15) ]).
[1260,1261,1270,1291,1304,1318,1327,1342,1348,1348,1359,1393,1413,1483,1522]
4> halt().
@fan [~/HiPE/tests/uffe] ~/HiPE/otp-x86/bin/erlc +native *.erl
@fan [~/HiPE/tests/uffe] ~/HiPE/otp-x86/bin/erl
Erlang (BEAM) emulator version 5.4.2003.10.26 [source] [hipe]
Eshell V5.4.2003.10.26 (abort with ^G)
1> lists:sort([element(1,timer:tc(test,run,[])) || _ <- lists:seq(1,15) ]).
[843,865,866,873,880,894,897,902,908,917,919,923,951,979,52391]
2> lists:sort([element(1,timer:tc(test,run,[])) || _ <- lists:seq(1,15) ]).
[863,872,917,920,934,941,957,958,959,960,975,983,1043,1078,1357]
3> lists:sort([element(1,timer:tc(test,run,[])) || _ <- lists:seq(1,15) ]).
[857,859,870,871,881,884,895,906,911,917,931,936,950,962,1091]
More or less we get a similar picture here.
- Times for BEAM are in the range [1260 - 1600]
- Times for HiPE are in the range [ 850 - 1100]
Some more comments:
- The benchmark reads data from a file which is handled as a stream.
Performing I/O can give a big fluctuation in times. Ideally,
the benchmark should be re-written so that the data is read once
from the file (converted to a list or binary), and only the time to
process the data is reported.
- timer:tc is NOT the best possible way to measure time;
ideally, some more accurate time measurements should be used.
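The first comment above, reading the data once and timing only the processing, could look roughly like the following. It is a sketch under assumptions: the module name bench_io is made up, and Process is a hypothetical fun standing in for whatever the benchmark actually does with its input. It still uses timer:tc for the measurement, so the second comment applies to it as well.

```erlang
-module(bench_io).
-export([time_processing/2]).

%% Read the whole file at Path into a binary before the clock starts,
%% then time only the call Process(Data).
%% Process is a hypothetical one-argument fun supplied by the caller.
%% Returns {Microseconds, Result}.
time_processing(Path, Process) when is_function(Process, 1) ->
    {ok, Data} = file:read_file(Path),
    timer:tc(Process, [Data]).
```

If the benchmark expects a list rather than a binary, the conversion (for example with binary_to_list/1) can be done before the timer:tc call so that it is not counted either.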
Kostis