[erlang-questions] Erlang read file benchmark

Joe Armstrong erlang@REDACTED
Sun Jul 10 15:13:05 CEST 2011


How large is the file you want to read lines from?

All the files I want to process are small (i.e. in relation to memory).
I don't think I've ever called read_line to read a file, since I know
it's rather slow.
I call file:read_file to read the entire file into a binary, then chunk
it into lines later.
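
A minimal sketch of the read-then-split approach described above, assuming
the file fits comfortably in memory and that lines end in "\n". The module
name read_lines and the functions lines/1 and split_lines/1 are
hypothetical names chosen for illustration; only file:read_file/1,
binary:split/3, lists:last/1 and lists:droplast/1 are real library calls.

```erlang
-module(read_lines).
-export([lines/1, split_lines/1]).

%% Read the whole file into one binary, then split it into lines.
%% Assumption: the file is small relative to available memory.
-spec lines(file:name_all()) -> {ok, [binary()]} | {error, term()}.
lines(Path) ->
    case file:read_file(Path) of
        {ok, Bin} -> {ok, split_lines(Bin)};
        {error, _} = Err -> Err
    end.

%% Split on "\n" only. A file ending in a newline produces one trailing
%% empty binary from binary:split/3, which is dropped here so that
%% <<"a\nb\n">> gives two lines rather than three.
-spec split_lines(binary()) -> [binary()].
split_lines(Bin) ->
    drop_trailing_empty(binary:split(Bin, <<"\n">>, [global])).

drop_trailing_empty([]) ->
    [];
drop_trailing_empty(Lines) ->
    case lists:last(Lines) of
        <<>> -> lists:droplast(Lines);
        _ -> Lines
    end.
```

Each line comes back as a binary with no encoding handling, and "\r\n"
endings are not stripped; both would need extra work if the input
requires them.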

/Joe


On Sun, Jul 10, 2011 at 1:33 AM, Bob Ippolito <bob@REDACTED> wrote:
> I think what most people want (especially for benchmarks) is something
> that doesn't care about encodings and doesn't have a lot of
> indirection. The current solution is VERY flexible, which comes at a
> severe cost to performance in this case.
>
> If you read the source code you'll see that file:read_line/1 calls
> into io:request/2 which eventually (in another process) ends up in
> file_io_server:io_request/2 and ends up reading either 128 bytes or
> 8 KB at a time, doing some Unicode junk, and ends up calling
> io_lib:collect_line/4 to collect each line a chunk at a time.
>
> If I was trying to win a benchmark I'd probably go directly to
> prim_file, do my own buffering, and use erlang:decode_packet/3 or the
> binary module to split on the newlines. If I wanted to make a nicer
> API I'd put that in a process to manage the buffering.
>
> On Sat, Jul 9, 2011 at 4:08 PM, Kenny Stone <kennethstone@REDACTED> wrote:
>> Why is it awful?
>>
>> On Sat, Jul 9, 2011 at 6:07 PM, Bob Ippolito <bob@REDACTED> wrote:
>>>
>>> file:read_line does some pretty awful things, so I'd expect it to be
>>> very slow. That said, there should be a much faster yet still easy way
>>> to do this, but there isn't one baked into OTP that I know of.
>>>
>>> On Saturday, July 9, 2011, Michael Truog <mjtruog@REDACTED> wrote:
>>> > He only showed the results on the command-line.  It would be nice to see
>>> > results that show runtime without the startup/teardown overhead that the
>>> > Erlang VM has, since it has a lot more going on than the perl interpreter.
>>> >  I know he briefly mentioned that the difference seemed minimal, but he
>>> > posted no results to show that.
>>> >
>>> > On 07/09/2011 12:15 PM, Evans, Matthew wrote:
>>> >> Sorry if this is a duplicate email.
>>> >>
>>> >> I can understand Erlang being a bit slower than Perl for this. Can't
>>> >> see an excuse for such a difference though.
>>> >>
>>> >> http://agentzh.org/#ErlangFileReadLineBenchmark
>>> >>
>>> >> Matt
>>> >>
>>> >> _______________________________________________
>>> >> erlang-questions mailing list
>>> >> erlang-questions@REDACTED
>>> >> http://erlang.org/mailman/listinfo/erlang-questions
>>> >>
>>> >
>>
>>


