[erlang-questions] Slow line oriented IO

Lukas Larsson <>
Mon Feb 18 14:25:47 CET 2013


Great! All tests over the weekend seem to have passed for the patch so it
will be part of R16B.

Lukas


On Sat, Feb 16, 2013 at 12:57 AM, Hynek Vychodil
<>wrote:

> It works like charm. I have patched R15B03 ad with 1MB read ahead it
> is unbelievable fast
>
> 20> timer:tc(line_read_test,read_std,["test.txt"]).
> {30732,2408}
>
> It is even four times faster than port version
>
> 24> timer:tc(line_read_test,read_port,["test.txt"]).
> {135676,2408}
>
> It is also faster without async threads. Measurements above was with
> +A 10. Without async threads:
>
> 2> timer:tc(line_read_test,read_std,["test.txt"]).
> {19322,2408}
>
> I was expected time around 100ms but current behavior is far beyond
> expectation. I hope this patch will find way into upcoming R16. It
> seems to speed up line oriented io for "normal" files two times also.
>
> Thanks a lot
>    Hynek Vychodil
>
> On Fri, Feb 15, 2013 at 4:58 PM, Lukas Larsson
> <> wrote:
> > Oh, and here are the benchmarks for me now:
> >
> > R15B03:
> >
> > 1> timer:tc(line_read_test,read_std,["test.txt"]).
> > {21307874,14448}
> >
> > R16B with patch:
> >
> > 1> timer:tc(line_read_test,read_std,["test.txt"]).
> > {91885,14448}
> >
> > It's not every day you find a two line change which increases speed by a
> > factor of 232!
> >
> > Lukas
> >
> >
> > On Fri, Feb 15, 2013 at 4:42 PM, Lukas Larsson <
> >
> > wrote:
> >>
> >> aha, you needed to have very long lines in the file to provoke the
> >> behaviour. The file I was experimenting on had the same line length all
> the
> >> time.
> >>
> >> I think I've found the culprit. Aparently the read_ahead option was
> >> ignored after the first buffer had been consumed. Try the patch below
> and
> >> see if it works for you:
> >>
> >> From 0884480691e047801cef117b19151a7d34c9e438 Mon Sep 17 00:00:00 2001
> >> From: Lukas Larsson <>
> >> Date: Fri, 15 Feb 2013 16:37:52 +0100
> >> Subject: [PATCH] invoke_read_line now remembers read_ahead settings
> >>
> >> ---
> >>  erts/emulator/drivers/common/efile_drv.c |    2 ++
> >>  1 file changed, 2 insertions(+)
> >>
> >> diff --git a/erts/emulator/drivers/common/efile_drv.c
> >> b/erts/emulator/drivers/common/efile_drv.c
> >> index 2279fec..d81f116 100644
> >> --- a/erts/emulator/drivers/common/efile_drv.c
> >> +++ b/erts/emulator/drivers/common/efile_drv.c
> >> @@ -1160,6 +1160,8 @@ static void invoke_read_line(void *data)
> >>             /* Need more place */
> >>             ErlDrvSizeT need = (d->c.read_line.read_size >=
> >> DEFAULT_LINEBUF_SIZE) ?
> >>                 d->c.read_line.read_size + DEFAULT_LINEBUF_SIZE :
> >> DEFAULT_LINEBUF_SIZE;
> >> +         if (need < (d->c.read_line.binp)->orig_size)
> >> +           need = (d->c.read_line.binp)->orig_size;
> >>             ErlDrvBinary   *newbin = driver_alloc_binary(need);
> >>             if (newbin == NULL) {
> >>                 d->result_ok = 0;
> >> --
> >> 1.7.10.4
> >>
> >>
> >>
> >>
> >> On Fri, Feb 15, 2013 at 1:29 AM, Hynek Vychodil <
> >
> >> wrote:
> >>>
> >>> Using async thread drops time down to 4.6s which is still 20 times
> >>> slower than mine simple wrapper.
> >>>
> >>> On Thu, Feb 14, 2013 at 4:25 PM, Lukas Larsson <>
> wrote:
> >>> > Hello!
> >>> >
> >>> > If you start erlang with an async thread (erl +A 1) you will see
> >>> > radically
> >>> > different behaviour. For me the time it takes to use read_line drops
> >>> > from
> >>> > 58s to 2s. When doing file i/o you should always have some async
> >>> > threads to
> >>> > help you do the work, this is why in R16B we changed the default
> from 0
> >>> > to
> >>> > 10 async threads.
> >>> >
> >>> > Lukas
> >>> >
> >>> >
> >>> > On Thu, Feb 14, 2013 at 3:46 PM, Hynek Vychodil
> >>> > <>
> >>> > wrote:
> >>> >>
> >>> >> Hello,
> >>> >> I know it was been already discussed here in list and it is also
> >>> >> recurring
> >>> >> topic for at least five years. But anyway I have been bitten by it
> >>> >> again
> >>> >> and
> >>> >> also found pretty pathological case. I have 30MB text file and it
> has
> >>> >> a
> >>> >> few
> >>> >> near to 1MB lines there. (I can provide file with same line lengths
> if
> >>> >> somebody interested.) What I have been observed is that reading this
> >>> >> file
> >>> >> using raw file:read_line/1 takes 51s! For comparison I have tried
> some
> >>> >> different approaches and what I got (line_read_test:read_std/1 is
> >>> >> using
> >>> >> file:read_line/1):
> >>> >>
> >>> >> 1> timer:tc(line_read_test,read_std,["test.txt"]).
> >>> >> {51028105,2408}
> >>> >> 2> timer:tc(line_read_test,read,["test.txt"]).
> >>> >> {226220,2408}
> >>> >> 3> timer:tc(line_read_test,read_port,["test.txt"]).
> >>> >> {139388,2408}
> >>> >>
> >>> >> $time perl -nE'$i++}{say $i' test.txt
> >>> >> 2408
> >>> >>
> >>> >> real    0m0.053s
> >>> >> user    0m0.044s
> >>> >> sys     0m0.008s
> >>> >>
> >>> >> $ time wc -l test.txt
> >>> >> 2408 test.txt
> >>> >>
> >>> >> real    0m0.013s
> >>> >> user    0m0.004s
> >>> >> sys     0m0.008s
> >>> >>
> >>> >> $ time ./a.out test.txt
> >>> >> 2408
> >>> >>
> >>> >> real    0m0.020s
> >>> >> user    0m0.012s
> >>> >> sys     0m0.008s
> >>> >>
> >>> >> It means erlang should be at least 225 times faster
> >>> >> (line_read_test:read/1
> >>> >> which has flow control). Erlang can be 350 times faster
> >>> >> (line_read_test:read_port/1 without flow control). Another high
> level
> >>> >> language (perl) is almost thousand times faster. Special C program
> is
> >>> >> almost
> >>> >> four thousands times faster and old good glibc is two and half
> >>> >> thousands
> >>> >> times faster. Come on guys it is not even fun when simple (and
> wrong)
> >>> >> erlang
> >>> >> wrapper around standard module is more than two order of magnitude
> >>> >> faster.
> >>> >> From mine experience when there is something two orders of magnitude
> >>> >> slower
> >>> >> it tells me there is something damn wrong. I have looked into
> >>> >> efile_drv.c
> >>> >> and
> >>> >> ti is unfortunately far beyond mine C skill but if simple buffering
> >>> >> and
> >>> >> binary:match/2 can outperform it 225 times there has to be something
> >>> >> rotten
> >>> >> in this code.
> >>> >>
> >>> >> I have also experimented with read_ahead option in file:open and
> >>> >> changing
> >>> >> to
> >>> >> less value makes thing worse.
> >>> >>
> >>> >> Just to make grasp how bad it is, in same time I'm able sort 150
> >>> >> million
> >>> >> 64bit values (1.2GB of data) three times (one CPU core same HW). It
> is
> >>> >> not
> >>> >> in
> >>> >> flow control, mine simple wrapper does flow control too. It can't
> make
> >>> >> current code less intrusive, if it consumes 100% CPU for 51s instead
> >>> >> of
> >>> >> 226ms
> >>> >> then it will definitely affect whole server. It is not in concurrent
> >>> >> access,
> >>> >> mine code allows concurrent access too. Admitting there is something
> >>> >> broken
> >>> >> is first step to fixing it. I hope I helped at least in this way.
> >>> >>
> >>> >> With best regards
> >>> >>   Hynek Vychodil
> >>> >>
> >>> >> _______________________________________________
> >>> >> erlang-questions mailing list
> >>> >> 
> >>> >> http://erlang.org/mailman/listinfo/erlang-questions
> >>> >>
> >>> >
> >>> _______________________________________________
> >>> erlang-questions mailing list
> >>> 
> >>> http://erlang.org/mailman/listinfo/erlang-questions
> >>
> >>
> >
> _______________________________________________
> erlang-questions mailing list
> 
> http://erlang.org/mailman/listinfo/erlang-questions
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20130218/490f2e5d/attachment.html>


More information about the erlang-questions mailing list