[erlang-questions] Strange optimization result

Sat Oct 20 16:08:41 CEST 2007

Hi
I am playing around with Tim Bray's widefinder problem.
Actually I am playing with Steve V's latest version
see http://steve.vinoski.net/blog/2007/10/18/ok-just-one-more-wf/

In it he is using a dict for Boyer-Moore searching, the dict is just
a map from char code to shift amount i.e. the keys are 1 - 255
and the values are ints.
I figured that it would be faster to just have a tuple with the shift values.
So I changed the code to generate a tuple instead of a dict. And the
lookups changed from
dict:fetch(C1, Tbl)
to
element(C1,Tbl)

To my big surprise the program now runs slower.
On a file with 1M rows approx. 192 M.

Steve's original code, using dics, as reported by time
real    0m20.444s
user    0m37.718s
sys     0m0.612s

My modified, using a tuple
real    0m25.849s
user    0m44.203s
sys     0m3.332s

Interesting to note is the high value for sys in my version.
Also the tuple version does not manage to max out the CPU use
it peaks at ~85%, while the dict version gets 100% CPU load.
The program starts 64 erlang processes on my dual core laptop.

My only idea is that this is caused by some pathological CPU cache
behaviour. Does anyone have a better explanation?

/Anders