[erlang-questions] "New" vs. "old" console behavior: bug or feature?
Robert Virding
robert.virding@REDACTED
Wed Apr 24 20:05:38 CEST 2013
Strange because both user.erl and group.erl "should" be able to handle output requests in the middle of getting input. But it is a little difficult to see in group as there is all this tricky search code. :-)
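
A minimal sketch of what "handling output requests in the middle of getting input" could look like. It is a toy process that speaks a small part of the Erlang I/O protocol, not the actual user.erl or group.erl code. The module name interleave_demo and the {input, Line} message standing in for keyboard input are assumptions made for illustration.

```erlang
-module(interleave_demo).
-export([start/0, demo/0]).

%% Toy I/O server (NOT the real user.erl/group.erl): while a read is
%% pending, it keeps serving put_chars requests.
start() ->
    spawn(fun idle/0).

%% No read outstanding: accept either kind of request.
idle() ->
    receive
        {io_request, From, ReplyAs, {get_line, _Enc, _Prompt}} ->
            reading(From, ReplyAs);
        {io_request, From, ReplyAs, {put_chars, _Enc, Chars}} ->
            write(From, ReplyAs, Chars),
            idle()
    end.

%% A read is outstanding: wait for input, but still answer writers.
reading(Reader, ReaderTag) ->
    receive
        {input, Line} ->               % hypothetical "keyboard" message
            Reader ! {io_reply, ReaderTag, Line},
            idle();
        {io_request, From, ReplyAs, {put_chars, _Enc, Chars}} ->
            write(From, ReplyAs, Chars),
            reading(Reader, ReaderTag)
    end.

write(From, ReplyAs, Chars) ->
    io:put_chars(Chars),
    From ! {io_reply, ReplyAs, ok}.

demo() ->
    Server = start(),
    Server ! {io_request, self(), read, {get_line, unicode, "> "}},
    Server ! {io_request, self(), write, {put_chars, unicode, "Hey!\n"}},
    %% The write is answered although the read is still pending.
    WriteReply = receive {io_reply, write, W} -> W after 1000 -> timeout end,
    Server ! {input, "term2}.\n"},
    ReadReply = receive {io_reply, read, R} -> R after 1000 -> timeout end,
    {WriteReply, ReadReply}.
```

Calling interleave_demo:demo() gets the write acknowledged before any input has been supplied, which is the behavior both modules "should" have.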
Robert
----- Original Message -----
> From: "Fred Hebert" <mononcqc@REDACTED>
> To: "Scott Lystig Fritchie" <fritchie@REDACTED>
> Cc: erlang-questions@REDACTED
> Sent: Wednesday, 24 April, 2013 9:46:02 AM
> Subject: Re: [erlang-questions] "New" vs. "old" console behavior: bug or feature?
>
> Hi Scott,
>
> The IO world of Erlang is a fun crazy thing :)
>
> I've spent time trying to document how the shell works back at
> http://ferd.ca/repl-a-bit-more-and-less-than-that.html. I'll do a quick
> roundup of things just to be clear on everything.
>
> Before going into the difference between 'new' and 'old' shells, there
> is a 'user' process, which you mentioned, part of the IO system. The
> 'user' process acts as a default top-level group leader for all the
> output coming from a process. All group leaders are inherited from the
> process' parent. They can also be modified, so that you may have
> different group leaders across a VM: they can be local processes,
> middle-men (like application_controller), or remote processes (this is
> how output from RPC calls gets printed on the calling node).
>
> By default, every OTP app will put its controller as a group leader for
> all sub-processes. This group leader will redirect output, but the
> feature is also overloaded to kill rogue processes on shutdown (it makes
> a list of all processes, inspects their group leader, and if it's the
> current app's pid, kills said process). Other tools like eunit and
> Common Test can inject themselves above test cases and pick what to
> print or not. By sending IO directly to 'user', we bypass that
> hierarchy and go straight to the node's main IO process. Other special
> cases can be used, such as 'standard_error', which will redirect output
> to the error channel.
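
A small sketch of the two group-leader facts described above: a spawned process inherits its parent's group leader, and changing the group leader redirects where io requests go. The module name gl_demo and the Collector process are hypothetical. The collector only acknowledges one request and reports its tag, so it is far from a complete I/O server.

```erlang
-module(gl_demo).
-export([inherited/0, redirected/0]).

%% A spawned process inherits its parent's group leader.
inherited() ->
    Parent = self(),
    spawn(fun() -> Parent ! {child_gl, group_leader()} end),
    receive
        {child_gl, ChildGL} -> ChildGL =:= group_leader()
    after 1000 -> timeout
    end.

%% Pointing a process at another group leader redirects its output.
%% Collector is a hypothetical stand-in that handles a single request.
redirected() ->
    Parent = self(),
    Collector = spawn(fun() ->
        receive
            {io_request, From, ReplyAs, Request} ->
                From ! {io_reply, ReplyAs, ok},
                %% Report only the request tag (e.g. put_chars).
                Parent ! {captured, element(1, Request)}
        end
    end),
    spawn(fun() ->
        group_leader(Collector, self()),
        io:format("hello~n")          % goes to Collector, not the console
    end),
    receive {captured, Kind} -> Kind after 1000 -> timeout end.
```

Writing io:format(user, "hello~n", []) in the second process instead would skip Collector and go to whatever is registered as 'user'.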
>
> That being said, there are two default implementations of a process that
> registers itself as 'user' on a node: the new (current) shell, and the
> 'old' shell. The choice of which one to pick is determined at boot time
> by the user_sup.erl module (part of kernel) through system flags:
>
> - If the node is a slave node, 'user' will point to a remote process.
> - If the node is started with no special flag, the new shell is started
>   through 'user_drv'. This 'user' proc will act as a middle-man between
>   input and output with a tty program and the different Erlang groups
>   (see group.erl in kernel) to allow multiple jobs and concurrent shells
>   without messed up output. Evaluation is handled by shell.erl (stdlib).
> - If the node is started with the -oldshell flag, the process in charge
>   is 'user.erl', which uses special IO devices ({fd,0,1} for IO) to deal
>   with the input and output channels for the node directly. It will also
>   send the evaluation to shell.erl.
> - If the node is started with -noshell, the 'user.erl' module is still
>   booted, but will not evaluate any input nor forward it.
> - If the node is started in -noinput mode, the 'user.erl' module is
>   still booted, but it will not forward any input, only output from the
>   node. It's a superset of -noshell and a bit safer because it opens the
>   IO port in a way that only has the 'out' channel open.
> - There is an undocumented -nouser flag. Such a flag makes sure that
>   neither user.erl nor user_drv.erl is started. The node will crash
>   unless you specifically decide to start a process that registers
>   itself as 'user' and decides to handle IO for your node. This is what
>   you should use were you planning to provide your own Erlang shell and
>   boot it as 'erl -nouser -s custom_shell'.
> - If it's not possible to boot the tty used by 'user_drv', it should
>   fall back to 'user.erl' as an IO leader.
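
To check from inside a running node which of the flags listed above were passed, something along these lines may help. It is only a sketch: the module name user_flags is made up, and it reports the command-line flags plus whether anything is registered as 'user'. It does not tell you whether user_drv fell back to user.erl because no tty was available.

```erlang
-module(user_flags).
-export([report/0]).

%% Report which shell-related flags the node was started with.
%% init:get_argument/1 returns {ok, Values} when the flag is present
%% and the atom 'error' when it is not.
report() ->
    Flags = [nouser, noshell, noinput, oldshell],
    [{Flag, init:get_argument(Flag) =/= error} || Flag <- Flags]
        ++ [{user_registered, whereis(user) =/= undefined}].
```

Starting a node with 'erl -oldshell' and calling user_flags:report() should show the oldshell entry as true. A node that silently fell back to the old shell shows it as false, which is part of why the fallback is hard to notice.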
>
> Alright. That covers most of it for the basics.
>
> To figure out why it blocks, we need to figure out the evaluation. The
> evaluation itself happens in a shell.erl process, which does an io
> request to the 'user' process (technically, its own group_leader, so
> that anyone may use the evaluator where they want. It just happens to be
> the 'user' process in this case).
>
> Input --> user.erl <---> shell.erl
>
> The shell does an io-request to user, which asks to read characters.
> The user.erl process forwards that data to the shell. The shell
> attempts to evaluate it, and if there's not enough data, it asks for
> more. user.erl then blocks until it can get more data to respond to the
> io request.
>
> When output is sent to 'user' it's sent as an additional io request, as
> a message. This message will not be read until user.erl has answered
> the shell's previous request. This is where you block.
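
The blocking described above can be modelled with a toy process. This is a sketch under simplifying assumptions, not the user.erl source: the module name blocking_demo and the {input, Line} message standing in for keyboard input are invented, and only the get_line and put_chars requests of the I/O protocol are handled.

```erlang
-module(blocking_demo).
-export([start/0, demo/0]).

%% Toy I/O server that serves one request at a time, like the old
%% shell's 'user' process is described as doing.
start() ->
    spawn(fun loop/0).

loop() ->
    receive
        {io_request, From, ReplyAs, {get_line, _Enc, _Prompt}} ->
            %% Selective receive: nothing else in the mailbox is looked
            %% at until input arrives, so queued writes just wait.
            receive
                {input, Line} -> From ! {io_reply, ReplyAs, Line}
            end,
            loop();
        {io_request, From, ReplyAs, {put_chars, _Enc, Chars}} ->
            io:put_chars(Chars),
            From ! {io_reply, ReplyAs, ok},
            loop()
    end.

demo() ->
    Server = start(),
    %% The "shell" asks for input ...
    Server ! {io_request, self(), read, {get_line, unicode, "> "}},
    %% ... and the "other proc" sends output right behind it.
    Server ! {io_request, self(), write, {put_chars, unicode, "Hey!\n"}},
    StillBlocked = receive {io_reply, write, _} -> false after 200 -> true end,
    %% Completing the input releases the queued write.
    Server ! {input, "term2}.\n"},
    ReadReply = receive {io_reply, read, R} -> R after 1000 -> timeout end,
    WriteReply = receive {io_reply, write, W} -> W after 1000 -> timeout end,
    {StillBlocked, ReadReply, WriteReply}.
```

In blocking_demo:demo() the write gets no reply during the first 200 ms and is only acknowledged once the read has been completed.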
>
> Input --> user.erl <---> shell.erl
>              ^----> other proc
>
> The new shell does things differently by using a 'group.erl' process for
> each IO group. Now each group.erl process has the same potential to
> block, with the exception that user_drv.erl will start one very specific
> 'group.erl' process to be 'user', and will not return it as a potential
> shell.erl input source (it would be 0 in '^G -> j', and it is not
> possible to select it). user_drv will also consider it to be a special
> group that can *always* output to tty, whereas other groups will only
> have their output dumped by default if they're not the currently active
> one (hence you do not get other shells' output by default when you
> switch tasks). This means that while you could block things by finding
> the specific 'group.erl' you're currently sending IO requests to by
> default, it's unlikely to happen by accident, and 'user' is now a safe
> process to send IO requests to.
>
> I hope this explains things. I would find it difficult to call it a bug
> given a solution exists to the problem already, but I do see why the
> fallback to the old shell when no tty is available could be problematic.
> I'm guessing it would be possible to make a 'raw shell', which does
> tasks similar to user_drv, but using a user.erl-like adapter instead of
> a tty program to communicate with, and starting it with
> 'erl -nouser -s rawshell' or something, or eventually making it the
> default that user_drv falls back to instead of 'user:start()'. I'm
> guessing this would be a very low priority for the OTP team, though.
>
> I hope this lengthy response answers your questions!
>
> Regards,
> Fred.
>
> On 04/23, Scott Lystig Fritchie wrote:
> > Hi, all. I can't figure out if this message should be sent to the
> > erlang-bugs list or the erlang-questions list ... so I'll go for the
> > more general audience.
> >
> > Summary: Starting Erlang with or without a tty/pseudo-tty can get you
> > a different console shell ("new" and "old", respectively) without you
> > realizing it.(*) If you don't know that you're using the old shell,
> > and if a process tries to send output to the 'user' registered
> > process(**), e.g. io:format(user, "Some message with ~p extra\n",
> > [Extra]), then it is possible that the io:format() call will not
> > return for seconds/minutes/hours/ever.
> >
> > My question: Is the kind of indefinite blocking on I/O described below
> > a bug or a feature?
> >
> > I have a test case that can reproduce this behavior. An automated
> > version (using Expect) can be found at:
> >
> > https://gist.github.com/slfritchie/ad8e5cf1603cbe326be7
> >
> > The basics of reproducing the hang are:
> >
> >   SSH session #1                    SSH session #2
> >   --------------                    --------------
> >   Start an Erlang daemon
> >   using "run_erl".
> >
> >   Attach to the daemon's console
> >   using "to_erl".
> >
> >                                     Start another Erlang VM
> >                                     and connect to the first
> >                                     VM via "-remsh".
> >
> >   At the console, type the
> >   following and press ENTER:
> >       {term1,
> >
> >                                     Run this command:
> >                                         io:format(user,
> >                                             "Hey!\n", []).
> >
> > The io:format/3 call in session #2 will behave differently depending
> > on whether session #1's "run_erl" command runs with a tty/pseudo-tty
> > or without.
> >
> >   A. With a tty/pty: The io:format() call returns immediately.
> >   B. Without a tty/pty: The io:format() call will hang indefinitely.
> >      It will remain blocked until the Erlang term parser in session #1
> >      has returned. For example, finishing the term with "term2}." and
> >      then pressing ENTER.
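
A caller that cannot rule out case B could bound how long it waits. The following is a hedged sketch, not something Riak or Lager is known to do: the module safe_out and the function format_user/3 are hypothetical. The io:format/3 call runs in a helper process and the caller gives up after TimeoutMs milliseconds.

```erlang
-module(safe_out).
-export([format_user/3]).

%% Hypothetical helper: write to 'user' without blocking the caller
%% for more than TimeoutMs milliseconds.
format_user(Format, Args, TimeoutMs) ->
    {Pid, Ref} = spawn_monitor(fun() -> io:format(user, Format, Args) end),
    receive
        {'DOWN', Ref, process, Pid, normal} -> ok;
        {'DOWN', Ref, process, Pid, Reason} -> {error, Reason}
    after TimeoutMs ->
        %% Stop waiting; the helper is killed, but the request may
        %% already sit in the mailbox of 'user' and be printed later.
        erlang:demonitor(Ref, [flush]),
        exit(Pid, kill),
        {error, timeout}
    end.
```

For example, safe_out:format_user("Hey!~n", [], 1000) would return {error, timeout} after one second in case B instead of hanging. This only protects the caller; it does nothing about the blocked 'user' process itself.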
> >
> > The same effect can be seen by forcing the use of the old shell,
> > without using SSH, by simply running "erl -oldshell" for session #1
> > (in an Xterm or other terminal window, or at the machine's hardware
> > console) instead of using SSH + "run_erl" + "to_erl".
> >
> > Riak was the application that triggered this bug hunt (in conjunction
> > with the Lager app)(***). Finding it has taken much longer than anyone
> > guessed. The reason is that the necessary precondition, starting
> > Erlang via 'run_erl' via SSH without an associated tty/pseudo-tty, is
> > not common. (Riak's packaging uses "sudo", which refuses to run if
> > there isn't a tty/pty available.)
> >
> > All attempts to duplicate the behavior failed because we didn't
> > understand that the root cause of the bad behavior was the old console
> > being silently chosen at VM startup when no tty/pty is available.
> >
> > -Scott
> >
> > (*) See
> > https://github.com/erlang/otp/blob/maint/lib/kernel/src/user_drv.erl#L103
> > for how the choice is made.
> >
> > (**) From the 'io' man page:
> >
> >     There is always a process registered under the name of user. This
> >     can be used for sending output to the user.
> >
> > ... where "output to the user" really means "output to the Erlang
> > virtual machine console."
> >
> > (***) For source code of Riak and Lager, respectively, see:
> > https://github.com/basho/riak
> > https://github.com/basho/lager
> > _______________________________________________
> > erlang-questions mailing list
> > erlang-questions@REDACTED
> > http://erlang.org/mailman/listinfo/erlang-questions