[erlang-questions] Strange performance issue - advice needed
Gmail
watson.timothy@REDACTED
Sat Apr 14 19:14:16 CEST 2012
I can't read the code well on my phone, but if you're using line-oriented I/O then that tends to perform badly.
I would suggest reading http://erlang.org/pipermail/erlang-questions/2012-March/065374.html as well.
Cheers,
Tim
Sent from my iPhone
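
The remark about line-oriented I/O can be illustrated with a small sketch. This is not code from the thread, and nothing here establishes that file output is what causes the timings reported below. It only shows one common way to reduce the number of small writes: open the file in raw mode with delayed_write, and pass each line together with its newline as a single iolist. The module name tgwriter_sketch and the helper write_lines/2 are hypothetical.

```erlang
-module(tgwriter_sketch).
-export([writer/1, write_lines/2]).

%% Hypothetical variant of the quoted tgwriter:writer/1.
%% raw bypasses the file server process; delayed_write buffers small writes.
%% A raw file handle is used here only by the process that opened it.
writer(Filename) ->
    {ok, Fh} = file:open(Filename, [write, raw, delayed_write]),
    loop(Fh).

loop(Fh) ->
    receive
        <<>> ->
            file:close(Fh);
        Line ->
            %% One call with an iolist instead of two separate writes.
            ok = file:write(Fh, [Line, $\n]),
            loop(Fh)
    end.

%% Convenience for trying the idea without a separate process:
%% writes every line, each followed by a newline, in a single call.
write_lines(Filename, Lines) ->
    {ok, Fh} = file:open(Filename, [write, raw, delayed_write]),
    ok = file:write(Fh, [[L, $\n] || L <- Lines]),
    file:close(Fh).
```

Comparing this against the quoted writer with timer:tc/3 on the same input would show whether the output side matters for a given line length.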
On 14 Apr 2012, at 16:18, Ian <hobson42@REDACTED> wrote:
> Hi all,
>
> Newbie here, learning Erlang. Having written some awfully slow code in new languages before, I am keen to write speedy code this time.
>
> I am writing a solution to the telegram problem. In the code below, tgpack receives words and builds lines, while tgwriter receives those lines and writes the result to a text file. Other code, not shown, generates words and sends them, as binaries, to tgpack.
>
> I am getting some very strange timings. When I ran it with 50 characters per line, it took 18.8 seconds to run. So I doubled the length of the line, thus halving the number of lines. Same input file.
>
> If the time were dominated by the line count, then the time taken would be about half. If it were dominated by the line length, then it would be about double. I did not think I would get a timing outside this range (9-40 seconds).
>
> It took 3.6 seconds!
>
> So I doubled the line length again, and at 200 characters per line, it took 0.6 seconds!
>
> I tried 25 chars per line, it took 122 seconds.
>
> In every case it was reading the same 1.7MB file. I also repeated the tests and the timings are constant (+/- 3%).
>
> So it appears that short lines are terribly inefficient. I want to understand why. Can anyone spot it?
>
> Thanks
>
> Ian
>
> The code for tgpack
> -module(tgpack).
> -export([pack/2]).
> %% tgpack - reads words, builds lines and sends them to the writer,
> %% where lines are as long as possible and less than Length.
> %% Upon reading <<>>, sends it on to close the file, and quits.
> pack(Length,Writer) ->
>     receive
>         <<>> -> Writer ! <<>>; % handle empty file
>         Word -> buldlist(Length,byte_size(Word),Writer,[Word])
>     end.
>
> % buildList
> % Length is max length of line
> % Size is length of line so far
> % Writer is file writing process
> % List is list of binaries built up for current line in reverse order
> buldlist(Length,Size,Writer,List) ->
>     receive
>         <<>> ->
>             Writer ! binaryFromList(List), % send last line.
>             Writer ! <<>>; % close the output
>         Word ->
>             S = Size + 1 + byte_size(Word),
>             case S >= Length of
>                 true -> % write old and start next line
>                     Writer ! binaryFromList(List),
>                     buldlist(Length,byte_size(Word),Writer,[Word]);
>                 false -> % add word to current
>                     buldlist(Length,S,Writer,[Word|List] )
>             end
>     end.
>
> The code for tgwriter
> -module(tgwriter).
> -export([writer/1]).
> %% writer - reads lines and writes them to Filename until eof
> %% eof is a zero length binary
> writer(Filename) ->
>     {ok, Fh} = file:open(Filename,[write,{delayed_write, 8096, 1000}]),
>     writemore(Fh).
>
> writemore(Fh) ->
>     receive
>         <<>> -> file:close(Fh);
>         Msg -> % write line
>             file:write(Fh, Msg),
>             file:write(Fh, "\n"),
>             writemore(Fh)
>     end.
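
The quoted tgpack code calls binaryFromList/1, which is not shown in the post. For readers who want to run the code, here is a minimal sketch of what such a helper might look like. It assumes the words of a line are joined with single spaces, which matches the "+ 1" in the size calculation, and that the list arrives in reverse order, as the comments in the post say. The module name tgjoin_sketch is hypothetical, and the original helper may well differ.

```erlang
-module(tgjoin_sketch).
-export([binaryFromList/1]).

%% Assumed behaviour (the original helper is not shown in the post):
%% List holds the words of one line, as binaries, in reverse order.
%% The result is one binary with the words in their original order,
%% separated by single spaces.
%% The quoted code never passes an empty list, so that case is not handled.
binaryFromList(List) ->
    [First | Rest] = lists:reverse(List),
    %% Build an iolist first, then flatten it into a binary once.
    iolist_to_binary([First | [[$\s, W] || W <- Rest]]).
```

For example, binaryFromList([<<"c">>, <<"b">>, <<"a">>]) returns <<"a b c">>. Since file:write/2 accepts iodata, another option would be to send the iolist to the writer as it is and skip building the binary.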