[erlang-questions] Strange optimization result

Caoyuan dcaoyuan@REDACTED
Mon Oct 22 12:05:52 CEST 2007


It seems GC is also a key for performance now :-) per my tbray4.erl
code, which uses plain Dict

After I added [{min_heap_size, 3000}] to spawn_opt, the elapsed time
dropped from 7.7 sec to 5.7s immediately.

The code is at:
http://blogtrader.net/page/dcaoyuan/entry/learning_coding_parallelization_was_tim


On 10/22/07, Thomas Lindgren <thomasl_erlang@REDACTED> wrote:
>
> --- Steve Vinoski <vinoski@REDACTED> wrote:
>
> > Anders sent me his code and I ran it on my 8-core
> > Linux box, with the
> > following performance results. VICTORY is right! :-)
> >
> > real    0m1.904s
> > user    0m7.917s
> > sys     0m1.185s
> >
> > Like I mentioned to Anders in private email, it's
> > nice to have someone more
> > experienced with Erlang finally taking a look at
> > this; I'm still a relative
> > newbie.
> > One thing I've liked about this entire exercise is
> > that the early attempts
> > at solving Tim Bray's Wide Finder in Erlang were
> > taking minutes to execute
> > and were providing only partial answers. Several of
> > us then started
> > whittling away at it, and because of the richness of
> > the language, we had a
> > variety of different avenues to explore. Over time,
> > we've vastly increased
> > the performance of our solutions. Anders's solution
> > now beats Ruby on the
> > same machine by about 0.3s, and because of the way
> > it uses multiple cores,
> > it will likely execute extremely quickly and
> > efficiently when Tim gets a
> > chance to try it on his T5120.
> >
> > Yes, fast solutions in other languages were quickly
> > found, but those had
> > almost nowhere to go beyond their initial forms in
> > terms of improvement, not
> > because they were already so fast, but because the
> > languages ran out of
> > alternatives. This is especially true when it comes
> > to taking advantage of
> > the T5120's many cores. I'm a fan of many languages,
> > including Ruby, Python,
> > Perl, and C++, all of which have figured prominently
> > in the collection of
> > various Wide Finder solutions. But for my money,
> > Erlang has fulfilled Tim's
> > original wishes the best, which is to take the best
> > possible advantage of
> > all those cores.
>
> Well done, everyone. You've chewed this over pretty
> well, I'd say, and it has been interesting to see how
> things have improved over time. Here are a couple of
> thoughts on further improvements:
>
> 1. Native code compilation? It's a bit hit-and-miss,
> but this could be the sort of problem that gains from
> it.
>
> 2. The speedup is 4.15 on 8 cores (if I'm reading
> things right: user/real). What is the bottleneck? Too
> small input, too much I/O, or is there something that
> could be improved or tuned further?
>
> And for language fans, the language itself could be a
> bit more helpful. While we haven't emphasized regexp
> crunching, it still seems like things could be easier.
> Here are some quick thoughts.
>
> 0. Not having to write the Boyer-Moore stuff by hand
>
> 1. Working with binaries like strings should be easier
>
> 2. Reading lines from files
>
> 3. Appropriate data chunking for file processing
> (people tried all from a few KB to several MB per
> chunk -- could the system figure out an appropriate
> size on its own?)
>
> 4. Perhaps a streaming interface would be even better?
> Jay Nelson suggested one a few years ago.
>
> 5. When looking at the use of dictionaries, it struck
> me that ets:update_counter/3 could have avoided the
> use of dictionaries and merging altogether. But, alas,
> there is the well-known snag that if the key does not
> exist in the table, you need to insert it yourself.
> Which means you get a race condition that, as far as I
> can see, ets can't handle safely. (Or could I be more
> creative?)
>
> 6. More intuitive APIs -- it's been instructive, and a
> bit alarming, to see how people outside the "core"
> community have had to struggle with false starts on
> this. More and better documentation and tutorials.
>
> Best,
> Thomas
>
>
> __________________________________________________
> Do You Yahoo!?
> Tired of spam?  Yahoo! Mail has the best spam protection around
> http://mail.yahoo.com
> _______________________________________________
> erlang-questions mailing list
> erlang-questions@REDACTED
> http://www.erlang.org/mailman/listinfo/erlang-questions
>


-- 
- Caoyuan



More information about the erlang-questions mailing list