[erlang-questions] Issues with stdin on ports

Tue Jul 30 07:44:34 CEST 2013

On 29/07/2013, at 8:20 PM, Anthony Grimes wrote:

> Yeah, re-reading your post a couple of times, I think we might be on the wrong page or something. Here is a low level example of what I'd like to be able to do in Erlang. This is a Clojure repl session where I interact with the 'cat' program via the built in Java Process library:
> 
> user=> (def proc (.exec (Runtime/getRuntime) (into-array String ["cat"])))
> #'user/proc
> user=> (def stdin (.getOutputStream proc))
> #'user/stdin
> user=> (def stdout (.getInputStream proc))

I have some trouble reading Clojure.  I don't know what the dots are.
Hazarding a guess,

	This is *PRECISELY* the "Hello, deadlock!"
	kind of buggy stuff that the C interface was designed
	to *not* let you write.

> Lots of unix programs work like this.
> We have cat in this example, but grep, wc, and various others work like that as well. It is this easy or easier to do the same thing in every other language I can think of.

Actually, NO.  You are talking about "filters" here,
and filters are designed to be connected into ***ACYCLIC*** networks.

> If it's fundamentally a bad thing, I'm surprised these programs work like that in the first place and that these languages support this.

The programs do NOT work the way you think they do.
A filter reads from its standard input.
It writes to its standard output.
If it could have emotions, it would view the prospect
of those two being the *same* thing with shuddering dread.
(Except of course, when the thing is the terminal.  The
user is assumed to be capable of infinite buffering.)

Erlang is perfectly happy to be connected to an ACYCLIC network
of pipe-linked processes too.
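
To make "acyclic" concrete, here is a minimal sketch, in Python purely for
illustration, assuming a POSIX system with sort and uniq on the PATH.  The
function name sort_unique is made up for the example.  Data flows one way
only (file -> sort -> uniq -> parent), so the parent never writes to
anything it is also reading from.

```python
import subprocess
import tempfile


def sort_unique(text: str) -> str:
    """Run text through `sort | uniq`, with the parent only reading."""
    with tempfile.TemporaryFile() as src:
        # The input lives in a file, not in a pipe fed by this process,
        # so there is no cycle between parent and children.
        src.write(text.encode())
        src.seek(0)
        sort = subprocess.Popen(["sort"], stdin=src, stdout=subprocess.PIPE)
        uniq = subprocess.Popen(["uniq"], stdin=sort.stdout,
                                stdout=subprocess.PIPE)
        # Drop the parent's copy of the middle pipe so that only uniq
        # holds its read end.
        sort.stdout.close()
        out, _ = uniq.communicate()
        sort.wait()
    return out.decode()
```

The parent touches exactly one pipe end, the last one, which is what keeps
the network acyclic.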

> It seems to be an entirely commonplace, basic feature of any remotely high level programming language.

Actually, no.  The ability to connect to the standard input *AND* the standard
output of the *same* process is *not* a commonplace feature of high level
programming languages (some offer it, some don't) because unless you code with
extreme (and to a certain extent, non-portable) care, you end up in deadlock land.

Only if one of the programs is absolutely guaranteed to write a tiny
amount of information (at most one PIPE_BUF worth) do you have
any shadow of a trace of a right to expect it to work.
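
A small sketch of how one might ask the system what PIPE_BUF actually is
for a freshly created pipe, assuming Python 3 on a POSIX system where
os.fpathconf knows the name "PC_PIPE_BUF".  The helper name
pipe_buf_for_new_pipe is hypothetical.

```python
import os


def pipe_buf_for_new_pipe() -> int:
    """Return the PIPE_BUF limit the OS reports for a new pipe."""
    read_fd, write_fd = os.pipe()
    try:
        # PIPE_BUF is a per-file limit, so it is queried on a pipe
        # descriptor rather than read from a global constant.
        return os.fpathconf(write_fd, "PC_PIPE_BUF")
    finally:
        os.close(read_fd)
        os.close(write_fd)
```

Many systems report something larger than the 512-byte POSIX minimum, but
portable code cannot count on that.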

If you don't believe me, believe the Java documentation,
where the page for java.lang.Process says

	All [the new process's] standard io (i.e. stdin, stdout, stderr)
	operations will be redirected to the parent process through
	three streams (getOutputStream(), getInputStream(),
	getErrorStream()).  The parent process uses these streams to
	feed input to and get output from the subprocess.
>>>>>>	Because some native platforms only provide limited buffer size
>>>>>>	for standard input and output streams, failure to promptly
>>>>>>	write the input stream or read the output stream of the
>>>>>>	subprocess may cause the subprocess to block, and even deadlock.

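POSIX only guarantees that PIPE_BUF is at least 512 bytes.
That is, should the parent process write 513 bytes to the child
before reading anything, and the child write 513 bytes to the parent
before reading anything, each can end up blocked waiting for the
other to read: hello deadlock!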
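
If you really must hold both ends, the usual way out is to make sure
reading and writing can proceed at the same time.  A minimal sketch,
assuming Python 3 on a POSIX system with cat on the PATH; round_trip is a
made-up name, and this is one way to do it rather than the only one.

```python
import subprocess
import threading


def round_trip(payload: bytes) -> bytes:
    """Send payload through `cat` and return what comes back."""
    proc = subprocess.Popen(["cat"], stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE)

    def feed() -> None:
        # Writing happens on its own thread, so a full pipe blocks only
        # this thread and never the reader below.
        proc.stdin.write(payload)
        proc.stdin.close()  # EOF on stdin is what lets cat finish

    writer = threading.Thread(target=feed)
    writer.start()
    echoed = proc.stdout.read()  # drain output while the writer runs
    writer.join()
    proc.wait()
    return echoed
```

Writing the whole payload first and only then reading would risk exactly
the stall described above once the payload outgrows the pipe buffers.
Python's own Popen.communicate() does an equivalent interleaving
internally.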

Like I said, connecting to *both* ends of a command through pipes
is something to anticipate with shuddering dread.  It is *not* a
standard feature to be used lightly.

I can't find anything about external processes in the Haskell 2010
report.  System.Process
http://www.haskell.org/ghc/docs/7.4-latest/html/libraries/process-1.1.0.1/System-Process.html
isn't mentioned in Haskell 2010.  I am actually pretty shocked that
the documentation doesn't mention the deadlock problem.