[erlang-questions] Issues with stdin on ports

Per Hedeland per@REDACTED
Wed Jul 31 18:55:41 CEST 2013


"Richard A. O'Keefe" <ok@REDACTED> wrote:
>
>Whether the pipes in your underlying UNIX are bidirectional
>or not, whether your popen() supports a "r+" argument or not,
>bidirectional pipe-based communication between a parent
>process and a child process is safe *IF*
>
>(1) the child process first reads all the data without writing
>    and some time after receiving EOF writes all its results
>    without reading, *OR*

Agreed - this is sufficient but not necessary.

>(2) both programs have been specially written to communicate
>    in this way, possibly using asynchronous I/O, non-blocking
>    I/O, or multiple threads in some way, *AND*
>    they are aware of the value of PIPE_BUF that applies to
>    the system they are running on.

No. It is sufficient that one of the programs (i.e. in this case the
Erlang VM) uses async/non-blocking I/O or threads. If this is the case,
the only requirement on the other program (i.e. in this case the one
started by open_port(spawn)) is that it reads its stdin (or whatever
file descriptor represents its end of the "out" pipe), and handles EOF
when doing so. Neither program needs to know anything about PIPE_BUF.

>(3) pipes let you report end-of-file by closing them, but no
>    other signal.

Yes! And *this* alone is what the original request in this thread is
about.

>    In order to signal a program at the other
>    end of a pipe, you need to know the process ID of the
>    process, which for the equivalent of popen("foobar",...)
>    is not trivially found.

True, but not relevant to the original request. However, since the issue
has been mentioned in some other comments: open_port(spawn) obviously
doesn't use popen(), and it carefully records the pid of the process
that it forks. In modern OTP versions (as of R15B02 I believe, based on
a patch from Matthias Lang) this pid is also available to Erlang code
through erlang:port_info(Port, os_pid) - i.e. sending it an arbitrary
POSIX signal is just an os:cmd/1 away.
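
As a minimal sketch of that last point (not taken from OTP itself): the
module name port_signal and the function signal_port/2 below are
hypothetical, and the code assumes a Unix-like system with a kill(1)
command on the PATH. It looks up the OS pid that the port recorded and
hands it to kill via os:cmd/1.

```erlang
-module(port_signal).
-export([signal_port/2]).

%% Send a POSIX signal, given by name (e.g. "TERM"), to the OS process
%% behind Port. SigName is pasted into a shell command line, so it must
%% come from trusted code only.
signal_port(Port, SigName) when is_port(Port), is_list(SigName) ->
    case erlang:port_info(Port, os_pid) of
        {os_pid, OsPid} when is_integer(OsPid) ->
            %% Assumption: kill(1) is available and accepts "-NAME pid".
            os:cmd("kill -" ++ SigName ++ " " ++ integer_to_list(OsPid)),
            ok;
        _ ->
            %% The port is already closed, or it has no OS process.
            {error, no_os_pid}
    end.
```

For a port opened with open_port({spawn, "/bin/cat"}, []), calling
port_signal:signal_port(Port, "TERM") should terminate the cat process.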

>(4) To send attention signals of some sort, you need to use
>    pty(4)s.

I assume that by "attention signals" (not a term I have seen in this
context) you mean "signals generated by sending a specific character on
the communication channel". If so, true, for this and many other things
you need a pty - but irrelevant.

>Per Hedeland wrote: "Erlang open_port/2 *by default* creates
>a bi-directional, deadlock-free communication channel to the
>external process."  That's still not quite right.  The thing
>that governs whether a deadlock is possible or not is the
>*protocol* the communicating processes use.  And it is still
>the case that an Erlang process may be deadlocked this way;
>it's just that the whole Erlang system won't be tied up if it
>happens.

No, the statement is quite correct, see above. The worst thing that can
happen is that the external process doesn't fulfill its obligation to
read its stdin, in which case the Erlang process may *block*. This is
not a deadlock; in fact it is a feature in normal usage, where the
external process *intermittently* doesn't read its stdin due to being
busy processing the data it has received.

But you're still reiterating issues that, if they were genuine, would
be with the open_port(spawn) functionality *that already exists*. No one
is suggesting that *it* be implemented, since it already is. And you can
take advantage of that fact to refute your own assertions by simply
using it, instead of posting them here. Here's a little something to get
you started:

1> P = open_port({spawn, "/bin/sh"}, []).
#Port<0.504>
2> P ! {self(), {command, "echo " ++ lists:duplicate(100000, $a) ++ "\n"}}, ok.
ok
3> Rec = fun (F, Port, Acc) -> receive {Port, {data, Data}} -> F(F, Port, Acc ++ Data) after 0 -> Acc end end.
#Fun<erl_eval.18.82930912>
4> Got = Rec(Rec, P, []), ok.
ok
5> length(Got).
100001

This was run on a Linux system where (according to its documentation)
PIPE_BUF is 4096 bytes, and the capacity of a pipe (which is something
else) is 65536 bytes.
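
The same round trip can be sketched outside the shell. In the session
above, "after 0" works because the data has already arrived by the time
command 4 is typed; in compiled code the receive loop may run before the
reply does, so this version waits for a chunk ending in a newline. The
module name port_collect, the function echo_roundtrip/1 and the 5 second
timeout are all choices made for this sketch, and /bin/sh is assumed to
exist.

```erlang
-module(port_collect).
-export([echo_roundtrip/1]).

%% Push N 'a' characters through the shell's echo and return the number
%% of bytes read back. That should be N + 1, counting the trailing
%% newline, as in the session above.
echo_roundtrip(N) ->
    Port = open_port({spawn, "/bin/sh"}, []),
    Port ! {self(), {command, "echo " ++ lists:duplicate(N, $a) ++ "\n"}},
    Reply = collect(Port, []),
    port_close(Port),
    length(Reply).

%% Accumulate data chunks until one ends in a newline, or give up after
%% 5 seconds of silence (an arbitrary timeout for this sketch).
collect(Port, Acc) ->
    receive
        {Port, {data, Data}} ->
            case Data =/= [] andalso lists:last(Data) =:= $\n of
                true  -> lists:append(lists:reverse([Data | Acc]));
                false -> collect(Port, [Data | Acc])
            end
    after 5000 ->
        lists:append(lists:reverse(Acc))
    end.
```

Chunks are collected in reverse and joined once at the end, which avoids
the repeated Acc ++ Data copying of the one-liner; the write of N bytes
is a single message send regardless of PIPE_BUF or the pipe capacity.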

--Per


