[erlang-questions] Slow line oriented IO

Hynek Vychodil vychodil.hynek@REDACTED
Sat Feb 16 00:57:39 CET 2013


It works like charm. I have patched R15B03 ad with 1MB read ahead it
is unbelievable fast

20> timer:tc(line_read_test,read_std,["test.txt"]).
{30732,2408}

It is even four times faster than port version

24> timer:tc(line_read_test,read_port,["test.txt"]).
{135676,2408}

It is also faster without async threads. Measurements above was with
+A 10. Without async threads:

2> timer:tc(line_read_test,read_std,["test.txt"]).
{19322,2408}

I was expected time around 100ms but current behavior is far beyond
expectation. I hope this patch will find way into upcoming R16. It
seems to speed up line oriented io for "normal" files two times also.

Thanks a lot
   Hynek Vychodil

On Fri, Feb 15, 2013 at 4:58 PM, Lukas Larsson
<lukas@REDACTED> wrote:
> Oh, and here are the benchmarks for me now:
>
> R15B03:
>
> 1> timer:tc(line_read_test,read_std,["test.txt"]).
> {21307874,14448}
>
> R16B with patch:
>
> 1> timer:tc(line_read_test,read_std,["test.txt"]).
> {91885,14448}
>
> It's not every day you find a two line change which increases speed by a
> factor of 232!
>
> Lukas
>
>
> On Fri, Feb 15, 2013 at 4:42 PM, Lukas Larsson <lukas@REDACTED>
> wrote:
>>
>> aha, you needed to have very long lines in the file to provoke the
>> behaviour. The file I was experimenting on had the same line length all the
>> time.
>>
>> I think I've found the culprit. Aparently the read_ahead option was
>> ignored after the first buffer had been consumed. Try the patch below and
>> see if it works for you:
>>
>> From 0884480691e047801cef117b19151a7d34c9e438 Mon Sep 17 00:00:00 2001
>> From: Lukas Larsson <lukas@REDACTED>
>> Date: Fri, 15 Feb 2013 16:37:52 +0100
>> Subject: [PATCH] invoke_read_line now remembers read_ahead settings
>>
>> ---
>>  erts/emulator/drivers/common/efile_drv.c |    2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/erts/emulator/drivers/common/efile_drv.c
>> b/erts/emulator/drivers/common/efile_drv.c
>> index 2279fec..d81f116 100644
>> --- a/erts/emulator/drivers/common/efile_drv.c
>> +++ b/erts/emulator/drivers/common/efile_drv.c
>> @@ -1160,6 +1160,8 @@ static void invoke_read_line(void *data)
>>             /* Need more place */
>>             ErlDrvSizeT need = (d->c.read_line.read_size >=
>> DEFAULT_LINEBUF_SIZE) ?
>>                 d->c.read_line.read_size + DEFAULT_LINEBUF_SIZE :
>> DEFAULT_LINEBUF_SIZE;
>> +         if (need < (d->c.read_line.binp)->orig_size)
>> +           need = (d->c.read_line.binp)->orig_size;
>>             ErlDrvBinary   *newbin = driver_alloc_binary(need);
>>             if (newbin == NULL) {
>>                 d->result_ok = 0;
>> --
>> 1.7.10.4
>>
>>
>>
>>
>> On Fri, Feb 15, 2013 at 1:29 AM, Hynek Vychodil <vychodil.hynek@REDACTED>
>> wrote:
>>>
>>> Using async thread drops time down to 4.6s which is still 20 times
>>> slower than mine simple wrapper.
>>>
>>> On Thu, Feb 14, 2013 at 4:25 PM, Lukas Larsson <lukas@REDACTED> wrote:
>>> > Hello!
>>> >
>>> > If you start erlang with an async thread (erl +A 1) you will see
>>> > radically
>>> > different behaviour. For me the time it takes to use read_line drops
>>> > from
>>> > 58s to 2s. When doing file i/o you should always have some async
>>> > threads to
>>> > help you do the work, this is why in R16B we changed the default from 0
>>> > to
>>> > 10 async threads.
>>> >
>>> > Lukas
>>> >
>>> >
>>> > On Thu, Feb 14, 2013 at 3:46 PM, Hynek Vychodil
>>> > <vychodil.hynek@REDACTED>
>>> > wrote:
>>> >>
>>> >> Hello,
>>> >> I know it was been already discussed here in list and it is also
>>> >> recurring
>>> >> topic for at least five years. But anyway I have been bitten by it
>>> >> again
>>> >> and
>>> >> also found pretty pathological case. I have 30MB text file and it has
>>> >> a
>>> >> few
>>> >> near to 1MB lines there. (I can provide file with same line lengths if
>>> >> somebody interested.) What I have been observed is that reading this
>>> >> file
>>> >> using raw file:read_line/1 takes 51s! For comparison I have tried some
>>> >> different approaches and what I got (line_read_test:read_std/1 is
>>> >> using
>>> >> file:read_line/1):
>>> >>
>>> >> 1> timer:tc(line_read_test,read_std,["test.txt"]).
>>> >> {51028105,2408}
>>> >> 2> timer:tc(line_read_test,read,["test.txt"]).
>>> >> {226220,2408}
>>> >> 3> timer:tc(line_read_test,read_port,["test.txt"]).
>>> >> {139388,2408}
>>> >>
>>> >> $time perl -nE'$i++}{say $i' test.txt
>>> >> 2408
>>> >>
>>> >> real    0m0.053s
>>> >> user    0m0.044s
>>> >> sys     0m0.008s
>>> >>
>>> >> $ time wc -l test.txt
>>> >> 2408 test.txt
>>> >>
>>> >> real    0m0.013s
>>> >> user    0m0.004s
>>> >> sys     0m0.008s
>>> >>
>>> >> $ time ./a.out test.txt
>>> >> 2408
>>> >>
>>> >> real    0m0.020s
>>> >> user    0m0.012s
>>> >> sys     0m0.008s
>>> >>
>>> >> It means erlang should be at least 225 times faster
>>> >> (line_read_test:read/1
>>> >> which has flow control). Erlang can be 350 times faster
>>> >> (line_read_test:read_port/1 without flow control). Another high level
>>> >> language (perl) is almost thousand times faster. Special C program is
>>> >> almost
>>> >> four thousands times faster and old good glibc is two and half
>>> >> thousands
>>> >> times faster. Come on guys it is not even fun when simple (and wrong)
>>> >> erlang
>>> >> wrapper around standard module is more than two order of magnitude
>>> >> faster.
>>> >> From mine experience when there is something two orders of magnitude
>>> >> slower
>>> >> it tells me there is something damn wrong. I have looked into
>>> >> efile_drv.c
>>> >> and
>>> >> ti is unfortunately far beyond mine C skill but if simple buffering
>>> >> and
>>> >> binary:match/2 can outperform it 225 times there has to be something
>>> >> rotten
>>> >> in this code.
>>> >>
>>> >> I have also experimented with read_ahead option in file:open and
>>> >> changing
>>> >> to
>>> >> less value makes thing worse.
>>> >>
>>> >> Just to make grasp how bad it is, in same time I'm able sort 150
>>> >> million
>>> >> 64bit values (1.2GB of data) three times (one CPU core same HW). It is
>>> >> not
>>> >> in
>>> >> flow control, mine simple wrapper does flow control too. It can't make
>>> >> current code less intrusive, if it consumes 100% CPU for 51s instead
>>> >> of
>>> >> 226ms
>>> >> then it will definitely affect whole server. It is not in concurrent
>>> >> access,
>>> >> mine code allows concurrent access too. Admitting there is something
>>> >> broken
>>> >> is first step to fixing it. I hope I helped at least in this way.
>>> >>
>>> >> With best regards
>>> >>   Hynek Vychodil
>>> >>
>>> >> _______________________________________________
>>> >> erlang-questions mailing list
>>> >> erlang-questions@REDACTED
>>> >> http://erlang.org/mailman/listinfo/erlang-questions
>>> >>
>>> >
>>> _______________________________________________
>>> erlang-questions mailing list
>>> erlang-questions@REDACTED
>>> http://erlang.org/mailman/listinfo/erlang-questions
>>
>>
>



More information about the erlang-questions mailing list