The Computer Language Shootout
Ulf Wiger (AL/EAB)
ulf.wiger@REDACTED
Fri Mar 24 14:50:32 CET 2006
Ulf Wiger wrote:
>
> I noticed that the gcc code was compiled with '-O3',
> and the ocaml entry with '-noassert -unsafe -ccopt O3'.
> Not that I know what all that means, but it sure sounds
> like they are squeezing that little extra umph out of
> their programs.
So I did some eprof profiling:
2> eprof:total_analyse().
FUNCTION                          CALLS    TIME
knucleotide:gen_freq/5           349968    34 %
knucleotide:update_counter/3     349961    30 %
ets:update_counter/3             349961    25 %
ets:insert/2                     104033     8 %
knucleotide:to_upper_no_nl/2      51668     1 %
ets:db_delete/1                      15     1 %
io:request/2                       1699     0 %
... and so on.
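For anyone wanting to reproduce a profile like the one above, here is a minimal sketch of driving eprof from code. It assumes a reasonably recent OTP release, where the total view is requested with eprof:analyze(total) rather than the eprof:total_analyse() call shown in the shell session; the module name profile_sketch and the function run/1 are made up for illustration and are not part of the benchmark.

```erlang
-module(profile_sketch).
-export([run/1]).

%% Hypothetical helper (not from the benchmark): profile a zero-arity
%% fun with eprof, print the per-function totals, and return the
%% fun's own result.
run(Fun) when is_function(Fun, 0) ->
    {ok, _Pid} = eprof:start(),          % start the eprof server
    {ok, Result} = eprof:profile(Fun),   % run Fun under call tracing
    eprof:analyze(total),                % print calls and time per function
    stopped = eprof:stop(),
    Result.
```

It could be called as profile_sketch:run(fun() -> knucleotide:from_file(File) end), where File is whatever input file is being measured.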
Adjusting the benchmark slightly so that it reads
the data from a file instead (basically two entry
points):
    main() ->
        Seq = dna_seq(stdin),
        calc(Seq),
        halt(0).
    from_file(F) ->
        {ok, Fd} = file:open(F, [read]),
        Seq = dna_seq(Fd),
        file:close(Fd),
        calc(Seq).
And then changing dna_seq() to
    dna_seq(Fd) -> seek_three(Fd), dna_seq(Fd, []).
    dna_seq(Fd, Seq) ->
        case io:get_line(Fd, '') of
            eof  -> list_to_binary(lists:reverse(Seq));
            Line -> Uline = to_upper_no_nl(Line),
                    dna_seq(Fd, [Uline|Seq])
        end.
and so on, mainly to make it easier to measure...
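The helper to_upper_no_nl is used above but not shown in this post. As a rough sketch only, assuming it upper-cases ASCII letters and drops newlines from a line returned by io:get_line/2 as a string, it might look like the following. The module name line_sketch and the clause layout are guesses; only the name and the accumulator arity (to_upper_no_nl/2 in the profile) come from the post.

```erlang
-module(line_sketch).
-export([to_upper_no_nl/1]).

%% Hypothetical reconstruction, not the benchmark's actual code:
%% upper-case a-z and drop newline characters from a string.
to_upper_no_nl(Line) ->
    to_upper_no_nl(Line, []).

to_upper_no_nl([$\n | T], Acc) ->
    to_upper_no_nl(T, Acc);                  % skip newlines
to_upper_no_nl([C | T], Acc) when C >= $a, C =< $z ->
    to_upper_no_nl(T, [C - 32 | Acc]);       % ASCII lower to upper
to_upper_no_nl([C | T], Acc) ->
    to_upper_no_nl(T, [C | Acc]);            % keep everything else
to_upper_no_nl([], Acc) ->
    lists:reverse(Acc).
```

Since each cleaned line is a plain list of bytes, the reversed list of lines in dna_seq/2 is a valid iolist for list_to_binary/1.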
I also removed the io:fwrite() calls and simply
used lists:map/2 to collect the results.
Compiling just gen_freq/5 to native gave very little
(time went down from 1.16 sec to 1.12 sec, ca 3%),
but compiling both gen_freq/5 and update_counter/3
gave a significant speedup: time now went down to
0.66 sec. (Commenting out the calls to gen_freq/5
left about 100 msec, which is probably not worth
trying to optimise.)
Comparing the different compilation options:
normal:     1.22 sec
native:     0.64 sec
native+o3:  0.64 sec
selective:  0.66 sec (gen_freq/5 and update_counter/3)
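The post does not say how these times were taken. One way to get comparable wall-clock figures, shown here purely as a sketch, is timer:tc/3 from stdlib; the module timing_sketch and both function names below are invented for this example.

```erlang
-module(timing_sketch).
-export([seconds/3, speedup_percent/2]).

%% Hypothetical helper: wall-clock time of apply(M, F, A) in seconds.
%% timer:tc/3 returns {Microseconds, Result}.
seconds(M, F, A) ->
    {Micros, _Result} = timer:tc(M, F, A),
    Micros / 1000000.

%% Relative reduction in run time, in percent of the original time.
speedup_percent(Before, After) when Before > 0 ->
    100 * (Before - After) / Before.
```

For example, timing_sketch:seconds(knucleotide, from_file, [File]) would time one run of the file-based entry point, and timing_sketch:speedup_percent(1.22, 0.64) puts the normal-versus-native difference in the table above at roughly 47.5 percent.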
Putting back all printouts, I can't see any major
difference between non-native and native.
This is quite interesting, as the total time
reported is ca 1.48 secs. There *should* be a
noticeable difference.
Final experiment:
$> cp $OTP_ROOT/lib/stdlib-1.13.10/src/lists.erl .
$> cp $OTP_ROOT/lib/stdlib-1.13.10/src/io.erl .
$> cp $OTP_ROOT/lib/stdlib-1.13.10/src/io_lib* .
$> ls -1 *.erl
io.erl
io_lib.erl
io_lib_format.erl
io_lib_fread.erl
io_lib_pretty.erl
knucleotide.erl
lists.erl
$> erlc -W +native *.erl
Rerunning, I get 1.04 secs, a 30% speedup over the
1.48 secs reported above.
What's likely to be causing problems are the
transitions between native and non-native code,
since many of the shootout benchmarks are I/O
heavy.
BR,
Ulf W