[erlang-questions] input too fast

Daniel Kwiecinski <>
Fri Jul 13 14:17:53 CEST 2007


Hi,

    To me it looks as though messages arrive from the port really fast. They
arrive so fast because eof is never detected, and most of the messages are
{data, []}. This happens because PortSettings in
open_port(PortName, PortSettings) has to contain something more than [eof],
for example [eof, {line, 1000}]. Then eof is detected and the port delivers
no more data.

    The above is based on my observations. I am not an Erlang expert, so
please forgive my ignorance if I missed something obvious.
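
    A minimal sketch of the quoted port loop with the port settings described
above, assuming (as observed, not verified against the emulator source) that
adding {line, 1000} is what lets eof be reported. The module name
port_eof_sketch and the function names are hypothetical. In line mode the data
arrives as {eol, Line} or {noeol, Line} tuples instead of plain lists, so the
receive clauses differ from the original portio/1:

```erlang
-module(port_eof_sketch).
-export([copy/0]).

%% Copy stdin (fd 0) to stdout (fd 1) through a port opened in line mode.
%% Assumption: [eof, {line, 1000}] makes the port deliver {Port, eof}.
copy() ->
    Port = open_port({fd, 0, 1}, [eof, {line, 1000}]),
    loop(Port),
    port_close(Port).

loop(Port) ->
    receive
        %% A complete line; the port strips the newline, so add it back.
        {Port, {data, {eol, Line}}} ->
            port_command(Port, [Line, $\n]),
            loop(Port);
        %% A line longer than 1000 bytes arrives in pieces without a newline.
        {Port, {data, {noeol, Chunk}}} ->
            port_command(Port, Chunk),
            loop(Port);
        %% End of input: stop looping.
        {Port, eof} ->
            ok
    end.
```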

-- 
Kind Regards,
Daniel Kwiecinski

2007/6/29, Fredrik Svahn <>:
>
> Sorry for double posting, it seems I have misconfigured something at
> trapexit...
>
> Fredrik Svahn wrote:
> I have also been frustrated by the way the io operations work when
> attempting to speed up a few of the example programs for the language
> shootout. The reverse-complement program, for instance (which is approx. 60
> times slower than the corresponding C program), spends 80% of its time
> reading from stdio, and I assume writing out the results is quite costly
> too.
>
> Just for fun I made a small patch to efile to allow for stdin/stdout to be
> opened as files. It probably has a lot of nasty side effects which I cannot
> even imagine in the worst of my nightmares, but the results for reading and
> writing are stunning. The "file" approach clocks in at 0.26 seconds for
> reading a large file from stdin and writing it to stdout. The corresponding
> result for a port program is 1.2 seconds, with the normal io approach
> measuring in at 16.9 seconds.
>
> I guess reading from stdin is not much of a problem for most Erlang
> applications which are supposed to be robust scalable systems staying up for
> years. I also think that this has been discussed before, probably at great
> length, although I cannot find any relevant posts at the moment. But now
> with escript maybe it might be a bit more interesting to have fast io
> operations for stdin/stdout, at least for unix systems?
>
> I haven't looked at memory consumption yet, but I expect the result
> should be the same as for Ulf, i.e. port programs build up large heaps if
> they cannot handle the messages really really really fast, while the file
> and normal io approach should not really consume much more memory than the
> buffer.
>
> BR /Fredrik
>
> Test-program:
>
> -module(io_test).
> -export([file/0,port/0,normal/0,fileio/0,portio/0,normalio/0]).
> -define(bufsize, 2048).
>
> file()-> io:format("~n~p~n",[timer:tc(?MODULE, fileio, [])]), halt().
> port()-> io:format("~n~p~n",[timer:tc(?MODULE, portio, [])]), halt().
> normal()-> io:format("~n~p~n",[timer:tc(?MODULE, normalio, [])]), halt().
>
> fileio()->
>     {ok,StdIn}=file:open("<stdin>",[raw, binary, read]),
>     {ok,StdOut}=file:open("<stdout>",[raw, binary, write]),
>     fileio(StdIn, StdOut).
>
> fileio(StdIn, StdOut) ->
>     case file:read(StdIn,?bufsize) of
>         eof -> ok;
>         {ok, Data} ->
>             file:write(StdOut, Data),
>             fileio(StdIn, StdOut)
>     end.
>
> portio()->
>     Port=open_port({fd, 0, 1},[eof]),
>     portio(Port),
>     port_close(Port).
>
> portio(Port)->
>     receive
>         {Port, {data, Data}} ->
>             port_command(Port, Data),
>             portio(Port);
>         {_Port, eof} -> ok
>     end.
>
> normalio() ->
>     case io:get_chars('',?bufsize) of
>         eof -> ok;
>         Data ->
>             io:put_chars(Data),
>             normalio()
>     end.
>
>
>
> Command-lines:
>
> $ erl -noinput -run io_test file < txt  > tmp-file ; tail -n 1 tmp-file
> {259951,ok}
> $ erl -noinput -run io_test port < txt  > tmp-port ; tail -n 1 tmp-port
> {1193521,true}
> $ erl -noinput -noshell -run io_test normal < txt  > tmp-normal ; tail -n 1 tmp-normal
> {16946068,ok}
>
>
> Patch for unix on R11B-5:
> diff ./erts/emulator/drivers/unix/unix_efile.c ./erts/emulator/drivers/unix/unix_efile.c.old
> 781,789c781
> <
> <     if (strcmp(name, "<stdin>") == 0) {
> <         fd = 0;
> <     } else if (strcmp(name, "<stdout>") == 0) {
> <         fd = 1;
> <     } else {
> <         fd = open(name, mode, FILE_MODE);
> <     }
> <
> ---
> >     fd = open(name, mode, FILE_MODE);
>
>
>
>
> On 6/26/07, Ulf Wiger (TN/EAB) <  > wrote:
>
> >
> > I submitted a sum-file entry to the shootout, which worked
> > nicely in my environment(*), but failed miserably in the
> > official benchmark.
> >
> > http://shootout.alioth.debian.org/gp4/benchmark.php?test=sumcol&lang=hipe&id=2
> >
> > It uses the (admittedly undocumented) command-line flag for
> > installing a custom user process, and opens stdin in line-
> > oriented mode.
> >
> > The problem is that it runs out of memory. As far as I can make
> > out, it's because the emulator chops up lines and sends them
> > to the process at such a high rate that, even though the
> > process is in a tight loop and doing minimal work on each item,
> > it can't stop the message queue from building up.
> >
> > This has disastrous effects when the input file is large enough.
> >
> > I realise that the feature is undocumented, but perhaps it's still
> > a valid point - some sort of generic flow-control on ports,
> > similar to the {active, bool()} on sockets, would be just the
> > thing here.
> >
> > (*) I realise that I tested it on an NFS-mounted disk (on a clearcase-
> > enabled file system at that). This might have given the port
> > sufficient flow control that the program lasted a bit longer, at least.
> >
> > _______________________________________________
> > erlang-questions mailing list
> > 
> > http://www.erlang.org/mailman/listinfo/erlang-questions
> >
>
>