[erlang-questions] input too fast

Fri Jun 29 21:04:55 CEST 2007

"Fredrik Svahn" <fredrik.svahn@REDACTED> wrote:
>
>Sorry for double posting, it seems I have misconfigured something at
>trapexit...

Hm, it seems that lately those of us who actually use mail to read the
mailing list are missing lots of posts (and so are the archives)...

>Just for fun I made a small patch to efile to allow for stdin/stdout to be
>opened as files. It probably has a lot of nasty side effects which I cannot
>even imagine in the worst of my nightmares, but the results for reading and
>writing are stunning. The "file" approach clocks in at 0.26 seconds for
>reading a large file from stdin and writing it to stdout. Corresponding
>results for a port program is 1.2 seconds with the normal io approach
>measuring in at 16.9 seconds.

Great test! I was quite intrigued as to why the file version was so much
faster than the port one, since they do almost exactly the same thing:
{fd, 0, 1} doesn't involve a "port program", it only tells the VM to use
those file descriptors, just like your mod to the file driver. Then I
noticed that you had given 'binary' to file:open/2 but not to
open_port/2. With that fixed, they're on a more equal footing. Actually,
in my tests the port version is almost 3 times faster than file on Linux
(about equal on FreeBSD), probably due to reading/writing 64k chunks
rather than 2k (you can figure out what the lack of 'binary' did to the
port version with chunks that size:-).
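
For reference, here is a minimal sketch of what the port version with
'binary' might look like. It is not the code from the test being
discussed; the module name fdcat and the choice of init:stop/0 at end
of input are my own assumptions.

```erlang
-module(fdcat).
-export([start/0]).

%% Hypothetical sketch: copy stdin to stdout through one fd port.
%% Assumed invocation: erl -noinput -s fdcat start < infile > outfile
%% (-noinput keeps the runtime itself from reading stdin).
start() ->
    %% {fd, 0, 1}: the VM reads descriptor 0 and writes descriptor 1;
    %% no external port program is started.
    %% 'binary' delivers each chunk as one binary rather than a list
    %% with one element per byte.
    Port = open_port({fd, 0, 1}, [binary, stream, eof]),
    loop(Port).

loop(Port) ->
    receive
        {Port, {data, Chunk}} ->
            %% Write the chunk back out unchanged.
            port_command(Port, Chunk),
            loop(Port);
        {Port, eof} ->
            %% Orderly shutdown, so the port is closed by the runtime.
            init:stop()
    end.
```

Dropping 'binary' from the option list gives the list-per-chunk
behaviour that made the original port numbers look so bad.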

>I haven't looked at memory consumption, yet, but I expect the result should
>be the same as for Ulf, i.e. port programs build up large heaps if they
>cannot handle the messages really really really fast, while the file and
>normal io approach should not really consume much more memory than the
>buffer.

I didn't see any lasting memory buildup in this test, with or without
'binary'. That only shows that "really really really fast" can be
somewhat quantified, not that the port version can handle arbitrarily
large files in general. In Ulf's case, using 'line' mode probably turned
each 64k chunk into more than 1000 line-list messages, and it may well
be that just processing those, with zero "useful" work, exceeded the
number of reductions that the VM will do before polling for input again.
Not so in this test, where each chunk resulted in a single message.
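
To see the message count for yourself, something like the following
sketch could be used. It opens the same descriptors in 'line' mode and
only counts the messages it receives. The module name linemsgs and the
1024-byte line limit are assumptions of mine, not taken from Ulf's
code.

```erlang
-module(linemsgs).
-export([start/0]).

%% Hypothetical sketch: count the messages a 'line' mode fd port
%% delivers for the data on stdin.
%% Assumed invocation: erl -noinput -s linemsgs start < infile
start() ->
    %% {line, 1024}: one message per input line; lines longer than
    %% 1024 bytes arrive in several pieces tagged 'noeol'.
    %% Without 'binary', each line is delivered as a list.
    Port = open_port({fd, 0, 1}, [{line, 1024}, eof]),
    count(Port, 0).

count(Port, N) ->
    receive
        {Port, {data, {_Flag, _Line}}} ->
            %% _Flag is eol or noeol; either way it is one message.
            count(Port, N + 1);
        {Port, eof} ->
            %% Report the total on stdout through the same port.
            port_command(Port, [integer_to_list(N), $\n]),
            init:stop()
    end.
```

With lines of about 60 bytes, a 64k chunk works out to 65536 div 60 =
1092 messages, which is where "more than 1000" comes from.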

--Per Hedeland