[erlang-questions] widefinder update
Thomas Lindgren
thomasl_erlang@REDACTED
Thu Nov 1 15:14:49 CET 2007
--- Anders Nygren <anders.nygren@REDACTED> wrote:
> On 10/29/07, Thomas Lindgren
> <thomasl_erlang@REDACTED> wrote:
> >
> > --- Steve Vinoski <vinoski@REDACTED> wrote:
> >
> > > On 10/28/07, Anders Nygren
> <anders.nygren@REDACTED>
> > > wrote:
> > > >
> > > > On 10/28/07, Hynek Vychodil
> > > <vychodil.hynek@REDACTED> wrote:
> > > > > Hi Anders,
> > > > > I rewrote your code a little. I removed all
> > > remaining binary bindings
> > > > > and it is noticeable faster again. Try
> > > wf_pichi3.erl.
> > > > >
> > > >
> > > > Hynek
> > > > that was great, Your change brings my
> wfinder1_1 +
> > > wfbm4_ets1_1
> > > > down to
> > > > real 0m1.118s
> > > > user 0m1.640s
> > > > sys 0m0.368s
> > > > on my 1.66 GHz dual core laptop.
> > >
> > >
> > > And on the 8-core 2.33GHz Intel Xeon Linux box
> with
> > > 2 GB RAM, this version
> > > is extremely fast:
> > >
> > > real 0m0.567s
> > > user 0m2.249s
> > > sys 0m0.956s
> >
> > (I'll ignore the unexplained sys time below. That
> > makes the discussion a bit preliminary; perhaps
> the
> > derived results should be computed some other way.
> > Apply grain of salt appropriately.)
> >
> > For those keeping track, the latest result is
> fully
> > 2.7 times faster than the best previous version
> (which
> > was block read), and 17.3 times faster than the
> > initial version. The latest speedup is basically
> due
> > to doing less work.
>
> Are we doing less work?
> Not really, over 3 versions wfinder, wfinder1 and
> wfinder1_1
> the only differences are how we do binary matching
> - do NOT create sub binaries
> - keep "pointers" for offsets inside one big binary
> The reason that it get so much faster seems to be
> that
> we have much less garbage collection.
Just to be clear, here I put memory allocation and
reclamation in the category of 'work'. So doing less
work in this sense means being more efficient for the
same data set. In the case of widefinder, the user
time has steadily decreased with new versions, which I
take to indicate precisely this.
> - When doing folds over large binaries do not
> repetedly
> split the binary in head and tail parts for
> recursive calls.
> Keep an offset counter to track the position in the
> binary.
Yes, this pitfall turns up time and again (along with
the issue of repeatedly appending data to binaries
:-).
> - One reason that it is even possible to solve the
> wide finder
> in parallel is that it is possible to "resync" at a
> random place
> in the file by locating a newline.
Right, good point. The data structure, or file format
in this case, has to lend itself to parallelization
too. Text works in the ordinary case because single
lines are nearly always short, so you won't run into
problems looking for the end of line. So you can chop
the file into chunks and 'repair' it into lines
afterwards. (In contrast, we have other formats like
BER, or in my case, diameter packets that can be 2^24
bytes.)
Other approaches might be to start the file with a
directory of sections to speed up traversal. (Even
simpler, though it defeats the purpose of this
exercise: generate many files of reasonable size and
process each one independently ...)
Best,
Thomas
__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com
More information about the erlang-questions
mailing list