[erlang-questions] Strange optimization result
Mon Oct 22 15:00:36 CEST 2007
On 10/22/07, Thomas Lindgren wrote:
> --- Steve Vinoski wrote:
> > Anders sent me his code and I ran it on my 8-core Linux box, with the
> > following performance results. VICTORY is right! :-)
> > real 0m1.904s
> > user 0m7.917s
> > sys 0m1.185s
> > Like I mentioned to Anders in private email, it's nice to have someone
> > more experienced with Erlang finally taking a look at this; I'm still
> > a relative newbie.
> > One thing I've liked about this entire exercise is that the early
> > attempts at solving Tim Bray's Wide Finder in Erlang were taking
> > minutes to execute and were providing only partial answers. Several of
> > us then started whittling away at it, and because of the richness of
> > the language, we had a variety of different avenues to explore. Over
> > time, we've vastly increased the performance of our solutions.
> > Anders's solution now beats Ruby on the same machine by about 0.3s,
> > and because of the way it uses multiple cores, it will likely execute
> > extremely quickly and efficiently when Tim gets a chance to try it on
> > his T5120.
> > Yes, fast solutions in other languages were quickly found, but those
> > had almost nowhere to go beyond their initial forms in terms of
> > improvement, not because they were already so fast, but because the
> > languages ran out of alternatives. This is especially true when it
> > comes to taking advantage of the T5120's many cores. I'm a fan of many
> > languages, including Ruby, Python, Perl, and C++, all of which have
> > figured prominently in the collection of various Wide Finder
> > solutions. But for my money, Erlang has fulfilled Tim's original
> > wishes the best, which is to take the best possible advantage of all
> > those cores.
> Well done, everyone. You've chewed this over pretty
> well, I'd say, and it has been interesting to see how
> things have improved over time. Here are a couple of
> thoughts on further improvements:
> 1. Native code compilation? It's a bit hit-and-miss,
> but this could be the sort of problem that gains from it.
> 2. The speedup is 4.15 on 8 cores (if I'm reading
> things right: user/real). What is the bottleneck? Too
> small input, too much I/O, or is there something that
> could be improved or tuned further?
> And for language fans, the language itself could be a
> bit more helpful. While we haven't emphasized regexp
> crunching, it still seems like things could be easier.
> Here are some quick thoughts.
> 0. Not having to write the Boyer-Moore stuff by hand
> 1. Working with binaries like strings should be easier
> 2. Reading lines from files
> 3. Appropriate data chunking for file processing
> (people tried all from a few KB to several MB per
> chunk -- could the system figure out an appropriate
> size on its own?)
> 4. Perhaps a streaming interface would be even better?
> Jay Nelson suggested one a few years ago.
> 5. When looking at the use of dictionaries, it struck
> me that ets:update_counter/3 could have avoided the
> use of dictionaries and merging altogether. But, alas,
> there is the well-known snag that if the key does not
> exist in the table, you need to insert it yourself.
> Which means you get a race condition that, as far as I
> can see, ets can't handle safely. (Or could I be more
> 6. More intuitive APIs -- it's been instructive, and a
> bit alarming, to see how people outside the "core"
> community have had to struggle with false starts on
> this. More and better documentation and tutorials.
I did the change to ets:update_counter last night, on my dual
core laptop that made an improvement from my previously
To avoid the update_counter race condition I have only one process
that all workers report their matches to.
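Roughly, the arrangement looks like the sketch below. This is a minimal illustration and not the actual Wide Finder code: the module name counter_sketch and the functions start/0, report/2 and stop/1 are made up for the example. The point is that a single process owns the table, so the "update_counter fails, then insert" sequence can never interleave with another writer.

```erlang
-module(counter_sketch).
-export([start/0, report/2, stop/1]).

%% Start the single counter process. It creates and owns a private
%% ets table, so no other process can write to it.
start() ->
    spawn(fun() ->
                  Tab = ets:new(counts, [set, private]),
                  loop(Tab)
          end).

%% Called by a worker for every match it finds (asynchronous send).
report(Pid, Key) ->
    Pid ! {match, Key},
    ok.

%% Ask the counter process for the final {Key, Count} list and stop it.
stop(Pid) ->
    Ref = make_ref(),
    Pid ! {stop, self(), Ref},
    receive
        {Ref, Counts} -> Counts
    end.

loop(Tab) ->
    receive
        {match, Key} ->
            bump(Tab, Key),
            loop(Tab);
        {stop, From, Ref} ->
            From ! {Ref, ets:tab2list(Tab)}
    end.

%% ets:update_counter/3 exits with badarg when the key is missing,
%% so insert the key on first sight. This is safe only because this
%% process is the sole writer.
bump(Tab, Key) ->
    try
        ets:update_counter(Tab, Key, 1)
    catch
        error:badarg ->
            ets:insert(Tab, {Key, 1})
    end.
```

Workers call counter_sketch:report(Pid, Key) for each match. stop/1 should only be called once all workers have finished, since message ordering is guaranteed only between a given pair of processes.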
I had some problems with native compilation but finally made it work