gen_tcp:connect will close when using supervisor.
zxq9
zxq9@REDACTED
Sat Nov 21 03:47:33 CET 2020
Hi, George.
Replies are inline.
On 2020/11/19 8:26, George Hope wrote:
> I have a simple gen_server which connects to a tcp port, when I run
> gen_server directly:
> erl -noshell -s ttest start
> it works fine, But when I run:
> erl -noshell -s tt_sup start_link
> the connection will be closed instantly, it will work if I unlink the
> supervisor.
> How could I run my simple gen_server with supervisor ?
It looks like what is going on below is a problem of mistaken execution
context. For example, let's look at tt_sup:start_link/0...
> -module(tt_sup).
> -behaviour(supervisor).
> -export([init/1, start_link/0, start_shell/0]).
>
> start_link() ->
>     io:format("Supervisor started with PID~p~n", [self()]),
>     {ok, Pid} = supervisor:start_link(tt_sup, []),
>     io:format("Supervisor PID=~p~n", [Pid]),
>     {ok, Pid}.
The code above prints a message to stdout saying that a supervisor has
started with PID self(), but self() is not the new supervisor's PID.
It is the PID of whatever process is calling tt_sup:start_link/0. The
tt_sup's PID is the one you receive as the return value of
supervisor:start_link/2 (which you capture and also print). The trouble
here is that the caller and the freshly spawned tt_sup are linked, so if
the caller exits, tt_sup exits too, and because tt_sup is linked to its
worker ttest, the worker is taken down as well.
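To see the effect in isolation, here is a minimal sketch (not your code).
The module name link_demo is made up for illustration, and the supervisor
has no children at all. A short-lived process plays the role of whatever
`erl -noshell -s` uses to call your function.

```erlang
-module(link_demo).
-behaviour(supervisor).
-export([run/0, init/1]).

%% Supervisor callback: no children, which is enough to show the link effect.
init([]) ->
    {ok, {#{strategy => one_for_one}, []}}.

%% Start a supervisor from a short-lived process, then report whether the
%% supervisor outlives the process that started it.
run() ->
    Parent = self(),
    Caller = spawn(fun() ->
        {ok, SupPid} = supervisor:start_link(?MODULE, []),
        Parent ! {started, self(), SupPid}
        %% The fun ends here, so this caller process exits right away.
    end),
    receive
        {started, Caller, Sup} ->
            Ref = erlang:monitor(process, Sup),
            receive
                {'DOWN', Ref, process, Sup, Reason} ->
                    {supervisor_down, Reason}
            after 1000 ->
                {supervisor_alive, Sup}
            end
    end.
```

Calling link_demo:run() from a shell should give you a
{supervisor_down, Reason} tuple rather than {supervisor_alive, Pid},
because the supervisor goes down with the process that started it.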
It appears the process that is kicking things off when you call `erl
-noshell` is exiting immediately after calling its target function.
Since application:start* is not being used, your supervisor is not
started under the Erlang runtime's application supervisor as would
normally happen.
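For reference, a minimal sketch of what going through application:start*
could look like. The module name tt_app, the application name tt, and the
contents of the .app file are my assumptions, not something from your
code. It only relies on tt_sup:start_link/0 returning {ok, Pid}, which
yours does.

```erlang
-module(tt_app).
-behaviour(application).
-export([start/2, stop/1]).

%% Called by the application controller, which is long-lived, so the
%% top-level supervisor is not linked to a process that exits immediately.
start(_StartType, _StartArgs) ->
    tt_sup:start_link().

stop(_State) ->
    ok.

%% A matching ebin/tt.app resource file (hypothetical values) might be:
%%
%% {application, tt,
%%  [{description, "TCP client test"},
%%   {vsn, "0.1.0"},
%%   {modules, [tt_app, tt_sup, ttest]},
%%   {registered, []},
%%   {applications, [kernel, stdlib]},
%%   {mod, {tt_app, []}}]}.
```

Assuming the compiled modules and tt.app live in ./ebin, something like
`erl -noshell -pa ebin -eval "application:ensure_all_started(tt)."`
should then start the whole thing under the application controller.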
> start_shell() ->
>     io:format("Supervisor started with PID~p~n", [self()]),
>     {ok, Pid} = supervisor:start_link(tt_sup, []),
>     io:format("Supervisor PID=~p~n", [Pid]),
>     unlink(Pid),
>     {ok, Pid}.
This version does the same thing, but the final action taken before
returning the PID is to unlink. Remember, this is the caller unlinking
from the freshly spawned supervisor, not the supervisor unlinking its
child -- and recall that the caller in this case is the temporary
process spawned by `erl -noshell`, not a long-lived application supervisor.
So in both cases your ttest *worker* actually is under supervision. The
only question is whether your supervisor is linked to whatever called it.
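If you want to check this for yourself, a rough sketch like the
following inspects the caller's links before and after unlink/1. The
module name who_is_linked is made up, and the supervisor has no
children, since only the caller-to-supervisor link matters here.

```erlang
-module(who_is_linked).
-behaviour(supervisor).
-export([check/0, init/1]).

%% Supervisor callback: an empty child list is enough for this check.
init([]) ->
    {ok, {#{strategy => one_for_one}, []}}.

%% Returns {LinkedBefore, LinkedAfter}: whether the calling process is
%% linked to the new supervisor before and after calling unlink/1.
check() ->
    {ok, Sup} = supervisor:start_link(?MODULE, []),
    {links, Before} = erlang:process_info(self(), links),
    true = unlink(Sup),
    {links, After} = erlang:process_info(self(), links),
    %% Clean up: the now-unlinked supervisor would otherwise keep running.
    exit(Sup, kill),
    {lists:member(Sup, Before), lists:member(Sup, After)}.
```

who_is_linked:check() should return {true, false}: the link exists right
after supervisor:start_link/2 and is gone after unlink/1, and at no point
does the supervisor's relationship with its own children change.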
In a normal application you want an application supervisor to be
starting your top level supervisor and it should be linked to it, so
supervisor:start_link/* is the appropriate thing to call (and in fact
there are no supervisor:start_monitor or supervisor:start functions
because it is expected that supervisors will be written in the context
of OTP compliant applications). In simple command line execution (via
escript or another utility) it is a little more gray whether or not you
really care about having it be supervised.
To make this work from the command line, either make it an escript or
wrap it up as an OTP application and launch it with a utility that
provides a full execution context. There are release builders
like rebar3 that can do this, or I have a project that provides a more
dynamic execution environment that makes writing and executing Erlang
feel a little more like working with Python.
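As a rough idea of the escript route, here is a sketch that assumes your
compiled tt_sup and ttest beam files are in ./ebin relative to where you
run it (the -pa path is my assumption). The point is only that main/1
does not return while the supervisor is alive, so the linked caller
sticks around.

```erlang
#!/usr/bin/env escript
%%! -pa ebin

%% The process running main/1 is the one linked to tt_sup, so it blocks
%% here instead of returning right after the supervisor has started.
main(_Args) ->
    {ok, Sup} = tt_sup:start_link(),
    Ref = erlang:monitor(process, Sup),
    receive
        {'DOWN', Ref, process, Sup, Reason} ->
            io:format("tt_sup exited: ~p~n", [Reason])
    end.
```

If tt_sup exits abnormally the script process goes down with it because
of the link, which for a small command line tool is probably what you
want anyway.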
Writing full-blown Erlang applications involves a little bit of extra
scaffolding and a touch of boilerplate to get started, but the benefits
are immense, so it's worth it.
[WARNING: A shameless plug for my own project follows...]
You might find this useful: https://zxq9.com/projects/zomp/
Here is a video talk-through of using it to build a chat server:
https://www.youtube.com/watch?v=yyM4N8cuau0
Using that I do `zx create project` and follow the (somewhat overly
verbose) prompts. Select "CLI application" if you just want it to run
like a script, or "Traditional Erlang application" if you want it
to be supervised like a normal long-lived Erlang application. ZX will
template a project for you of either style.
The default CLI application is basically a "hello, world" that runs in a
full execution environment, and the default "Erlang application"
template is a simple telnet chat/echo server (the basis for change in
the example video above). You can modify either one to do what you want.
There are comments in the templated source files that explain what all
the pieces do.
To run the application from the project directory use `zx runlocal`; to
run it from anywhere else use `zx rundir [path to project]`. ZX will
automatically build or rebuild whatever changes you've made.
The most important thing you can take away from the above is to learn
how OTP applications are structured so that you can live in that world
comfortably and not be confused. It doesn't matter whether you use ZX,
rebar3, erlang.mk or run your projects by hand as you were above. The
point is to grok what is happening and to always be able to answer "what
process is running the piece of code I'm looking at right now?"
Hopefully this explains more than it confuses.
Have fun making stuff!
-Craig