[erlang-questions] benchmarks game harsh criticism (was Learning Erlang from the scratch)

Sat Nov 24 14:22:29 CET 2007

Per Hedeland wrote:
> "Ulf Wiger (TN/EAB)" <ulf.wiger@REDACTED> wrote:
>> When playing with the benchmark dealing with line-oriented
>> input, I experimented with the line-oriented socket option.
> 
> I think you mean port option.

Yes, of course. Thanks.

> 
>> It was fast - so fast, in fact, that the Erlang program couldn't
>> keep up, even though it ran in the tightest loop possible.
>> To solve this, we wouldn't have to make the system unsafe. We'd
>> need to implement flow control on port input, much like that
>> which already exists in the inet driver. Erlang would be better
>> for it - not worse.
>>
>> I think this is a good finding.
> 
> I'm afraid I'll have to challenge it though (I think I already did so,
> but I guess I didn't make my point very well). It says nothing at all
> about how fast the port I/O is, only that design choices in the VM when
> it comes to the relative frequency of polling for I/O vs scheduling
> processes are such that if input is always available, you really can't
> get much done in your Erlang code. I.e. it's only about the relative
> amount of processing allocated to doing I/O and running Erlang, not
> about absolute speed.
> 
> And using a "raw" port a.k.a. one of the builtin fd/spawn drivers for
> reading from a disk file is rather "silly" - it's nice because it allows
> (or can allow) for the "everything is a file" concept, and work
> independently of whether the I/O channel refers to an actual file or to a
> pipe/FIFO/socket, but it means that you keep calling poll() to get an
> answer that will always be the same when the input actually is a file
> - surely not optimal.

This is true. It was quite fast when I measured it on my workstation,
but I learned later that it only worked because the disk was slow
enough that my Erlang code could keep up. With a faster disk, the
system was overrun. If one knows that it's a plain file, it should
be possible to do even better.
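
To illustrate the plain-file case, here is a minimal sketch that reads a
regular file line by line through the file module instead of a port. It
assumes a current OTP release where file:read_line/1 is available; the
module name, function names and the 64 KB read-ahead size are made up for
illustration and are not taken from the benchmark code.

```erlang
-module(plain_file_lines).
-export([count_lines/1]).

%% Count the lines of a regular file without going through an fd/spawn port.
%% raw + read_ahead lets the file driver fetch large blocks, so the reader
%% pulls data at its own pace and nothing is pushed into its mailbox.
count_lines(Path) ->
    {ok, Fd} = file:open(Path, [read, raw, binary, {read_ahead, 65536}]),
    try
        count_loop(Fd, 0)
    after
        file:close(Fd)
    end.

%% Read one line per iteration until end of file.
count_loop(Fd, N) ->
    case file:read_line(Fd) of
        {ok, _Line}     -> count_loop(Fd, N + 1);
        eof             -> N;
        {error, Reason} -> error({read_failed, Reason})
    end.
```

Since the caller asks for each line, the process cannot be overrun no
matter how fast the disk is.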

It should also be said, perhaps, that this hasn't been a problem
in our products, since we usually use either inet ports or custom
linked-in drivers, and don't allow the outside world to flood our
systems. Obviously, using plain pipes is dangerous without any way
of stemming the flow.
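
For comparison, this is roughly what the flow control on inet ports looks
like from the Erlang side, using the {active, once} socket option. It is
only a sketch: the module name, the loopback setup and the three test lines
are assumptions made for the sake of a self-contained example.

```erlang
-module(active_once_sketch).
-export([demo/0]).

%% Send three lines over a loopback TCP connection and receive them one
%% message at a time. Returns the received lines in order.
demo() ->
    {ok, L} = gen_tcp:listen(0, [binary, {packet, line}, {active, false},
                                 {ip, {127,0,0,1}}]),
    {ok, Port} = inet:port(L),
    {ok, C} = gen_tcp:connect({127,0,0,1}, Port, [binary, {active, false}]),
    {ok, S} = gen_tcp:accept(L),
    ok = gen_tcp:send(C, <<"one\ntwo\nthree\n">>),
    ok = gen_tcp:close(C),
    Lines = recv_loop(S, []),
    ok = gen_tcp:close(L),
    Lines.

%% The socket delivers at most one message and then goes passive again.
%% It is re-armed only when the process is ready for more, so the sender
%% is held back by TCP instead of filling the mailbox.
recv_loop(S, Acc) ->
    ok = inet:setopts(S, [{active, once}]),
    receive
        {tcp, S, Line}         -> recv_loop(S, [Line | Acc]);
        {tcp_closed, S}        -> lists:reverse(Acc);
        {tcp_error, S, Reason} -> error({tcp_error, Reason})
    end.
```

The fd/spawn ports have no equivalent of the setopts call in recv_loop/2,
which is why a fast producer can flood the receiving process.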

> Flow control on fd/spawn ports could still be very useful (e.g. when
> running Erlang in a Unix pipeline, or having a third-party port program
> that just spews data at you as fast as it can) - but it's not relevant
> for actual file I/O - at least not file *input*.

Not only would it be useful - I'd even say it's necessary. (:
And in lieu of a more logical and optimal way of achieving
file input, it would also be a huge improvement over the
current implementation of io:get_line().

BR,
Ulf W