[erlang-questions] Strange optimization result

Robert Virding rvirding@REDACTED
Mon Oct 22 23:13:57 CEST 2007


Actually it is a surprise to me that the dict version compares so favourably
to both the tuple version and the ets version considering that it is
implemented wholly in Erlang while the others are implemented in C in the
emulator. It must have got something right. :-)
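
For readers who have not seen the code under discussion, here is a minimal sketch of the kind of counting the dict version does, using only the pure-Erlang dict module from stdlib. The module name dict_count and its function names are made up for illustration; this is not the code from the thread.

```erlang
-module(dict_count).
-export([count/1, merge/2]).

%% Count how often each key occurs in a list.
%% dict:update_counter/3 adds the increment to the stored value,
%% or stores the increment itself if the key is not yet present.
count(Keys) ->
    lists:foldl(fun(Key, Acc) -> dict:update_counter(Key, 1, Acc) end,
                dict:new(), Keys).

%% Combine the counts produced by two workers by summing values
%% for keys that occur in both dicts.
merge(D1, D2) ->
    dict:merge(fun(_Key, V1, V2) -> V1 + V2 end, D1, D2).
```

Each worker would build its own dict with count/1, and the results would be combined pairwise with merge/2.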

It might help a lot if there were support for it in the emulator. It would
not need to be implemented there completely, just given some support
functions. Some parts would be difficult to do in the emulator. Future EEP?

Robert

On 22/10/2007, Caoyuan <dcaoyuan@REDACTED> wrote:
>
> It seems GC is also key to performance now :-) judging by my tbray4.erl
> code, which uses a plain dict.
>
> After I added [{min_heap_size, 3000}] to spawn_opt, the elapsed time
> dropped from 7.7 s to 5.7 s immediately.
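
A minimal sketch of what passing that option to spawn_opt looks like. The module heap_spawn and the function run/1 are hypothetical names and not taken from tbray4.erl. Note that min_heap_size is measured in words, not bytes. A larger initial heap plausibly helps here because the process needs fewer garbage collections while its heap grows.

```erlang
-module(heap_spawn).
-export([run/1]).

%% Run Fun in a new process that starts with a larger heap,
%% then wait for its result.
run(Fun) ->
    Parent = self(),
    Ref = make_ref(),
    %% min_heap_size is given in words; the emulator may round it up
    %% to the next heap size it actually supports.
    spawn_opt(fun() -> Parent ! {Ref, Fun()} end,
              [{min_heap_size, 3000}]),
    receive
        {Ref, Result} -> Result
    end.
```

The right value depends on the workload, so it is worth measuring a few sizes rather than copying 3000.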
>
> The code is at:
>
> http://blogtrader.net/page/dcaoyuan/entry/learning_coding_parallelization_was_tim
>
>
> On 10/22/07, Thomas Lindgren <thomasl_erlang@REDACTED> wrote:
> >
> > --- Steve Vinoski <vinoski@REDACTED> wrote:
> >
> > > Anders sent me his code and I ran it on my 8-core
> > > Linux box, with the
> > > following performance results. VICTORY is right! :-)
> > >
> > > real    0m1.904s
> > > user    0m7.917s
> > > sys     0m1.185s
> > >
> > > Like I mentioned to Anders in private email, it's
> > > nice to have someone more
> > > experienced with Erlang finally taking a look at
> > > this; I'm still a relative
> > > newbie.
> > > One thing I've liked about this entire exercise is
> > > that the early attempts
> > > at solving Tim Bray's Wide Finder in Erlang were
> > > taking minutes to execute
> > > and were providing only partial answers. Several of
> > > us then started
> > > whittling away at it, and because of the richness of
> > > the language, we had a
> > > variety of different avenues to explore. Over time,
> > > we've vastly increased
> > > the performance of our solutions. Anders's solution
> > > now beats Ruby on the
> > > same machine by about 0.3s, and because of the way
> > > it uses multiple cores,
> > > it will likely execute extremely quickly and
> > > efficiently when Tim gets a
> > > chance to try it on his T5120.
> > >
> > > Yes, fast solutions in other languages were quickly
> > > found, but those had
> > > almost nowhere to go beyond their initial forms in
> > > terms of improvement, not
> > > because they were already so fast, but because the
> > > languages ran out of
> > > alternatives. This is especially true when it comes
> > > to taking advantage of
> > > the T5120's many cores. I'm a fan of many languages,
> > > including Ruby, Python,
> > > Perl, and C++, all of which have figured prominently
> > > in the collection of
> > > various Wide Finder solutions. But for my money,
> > > Erlang has fulfilled Tim's
> > > original wishes the best, which is to take the best
> > > possible advantage of
> > > all those cores.
> >
> > Well done, everyone. You've chewed this over pretty
> > well, I'd say, and it has been interesting to see how
> > things have improved over time. Here are a couple of
> > thoughts on further improvements:
> >
> > 1. Native code compilation? It's a bit hit-and-miss,
> > but this could be the sort of problem that gains from
> > it.
> >
> > 2. The speedup is 4.15 on 8 cores (if I'm reading
> > things right: user/real). What is the bottleneck? Too
> > small input, too much I/O, or is there something that
> > could be improved or tuned further?
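
The speedup figure can be checked against the timings Steve posted (real 1.904 s, user 7.917 s, sys 1.185 s). The second ratio, which also counts system time, is an extra comparison and not something claimed in the thread:

```latex
\[
\frac{t_{\mathrm{user}}}{t_{\mathrm{real}}}
  = \frac{7.917}{1.904} \approx 4.16,
\qquad
\frac{t_{\mathrm{user}} + t_{\mathrm{sys}}}{t_{\mathrm{real}}}
  = \frac{7.917 + 1.185}{1.904} \approx 4.78
\]
```

Either way, on average between four and five of the eight cores are busy for the duration of the run.
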
> >
> > And for language fans, the language itself could be a
> > bit more helpful. While we haven't emphasized regexp
> > crunching, it still seems like things could be easier.
> > Here are some quick thoughts.
> >
> > 0. Not having to write the Boyer-Moore stuff by hand
> >
> > 1. Working with binaries like strings should be easier
> >
> > 2. Reading lines from files
> >
> > 3. Appropriate data chunking for file processing
> > (people tried everything from a few KB to several MB per
> > chunk -- could the system figure out an appropriate
> > size on its own?)
> >
> > 4. Perhaps a streaming interface would be even better?
> > Jay Nelson suggested one a few years ago.
> >
> > 5. When looking at the use of dictionaries, it struck
> > me that ets:update_counter/3 could have avoided the
> > use of dictionaries and merging altogether. But, alas,
> > there is the well-known snag that if the key does not
> > exist in the table, you need to insert it yourself.
> > Which means you get a race condition that, as far as I
> > can see, ets can't handle safely. (Or could I be more
> > creative?)
> >
> > 6. More intuitive APIs -- it's been instructive, and a
> > bit alarming, to see how people outside the "core"
> > community have had to struggle with false starts on
> > this. More and better documentation and tutorials.
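
On point 5, one way the missing-key case can be handled is sketched below, assuming a public set table shared by the workers. The module ets_count and its function names are invented for illustration. bump/2 relies on ets:insert_new/2, which reports whether this caller created the entry, so a process that loses the race falls back to a plain update. bump_default/2 uses ets:update_counter/4, which to my knowledge arrived in a much later OTP release (18) and so was not available when this thread was written.

```erlang
-module(ets_count).
-export([new/0, bump/2, bump_default/2]).

%% A public table so that several worker processes can update it.
new() ->
    ets:new(counts, [set, public]).

%% Increment the counter for Key, creating it if it is missing.
%% update_counter/3 raises badarg when the key does not exist.
bump(Tab, Key) ->
    try
        ets:update_counter(Tab, Key, 1)
    catch
        error:badarg ->
            case ets:insert_new(Tab, {Key, 1}) of
                true ->
                    %% We created the entry with an initial count of 1.
                    1;
                false ->
                    %% Another process inserted it first; the key now
                    %% exists, so the update can be retried.
                    ets:update_counter(Tab, Key, 1)
            end
    end.

%% Same effect with the four-argument form: {Key, 0} is used as the
%% starting object when Key is absent.
bump_default(Tab, Key) ->
    ets:update_counter(Tab, Key, 1, {Key, 0}).
```

The retry in bump/2 assumes that nobody deletes keys concurrently; if entries can be deleted, the second update could fail as well.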
> >
> > Best,
> > Thomas
> >
> >
> > _______________________________________________
> > erlang-questions mailing list
> > erlang-questions@REDACTED
> > http://www.erlang.org/mailman/listinfo/erlang-questions
> >
>
>
> --
> - Caoyuan
>