[erlang-questions] tcp connections dropped in gen_server

Ladislav Lenart lenartlad@REDACTED
Tue Sep 6 13:15:32 CEST 2011


Hello.

On 5.9.2011 23:40, Reynaldo Baquerizo wrote:
>> [snip]
>>
>> Also, if I understand the code correctly, the newly created connection
>> processes (acceptors) are not supervised. To prevent future problems
>> with this I strongly recommend you to modify your code slightly as
>> suggested in the book "Erlang and OTP in action".
>
> I didn't feel the need to supervised those connections. I fail to see
> the difference between leaving them unattended and simple_one_for_one
> with no restart.

There was a comment in tcp_sup:init/1 saying "tune for production demands" to
give you a hint to change the restart strategy according to your needs. The
advantage of having the connection processes supervised (with suitable restart
strategy) is that the supervisor should log all unexpected crashes of the
connection processes. It is also nice to look at appmon and see where the
processes belong.
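
To make the difference concrete, here is a minimal, self-contained sketch
of a simple_one_for_one supervisor with temporary (never restarted)
children. The module name sup_demo and its functions are made up for
illustration and are not part of your application:

     -module(sup_demo).
     -behaviour(supervisor).

     -export([start_link/0, start_worker/1, demo/0]).
     -export([init/1, worker_start_link/0, worker/0]).

     start_link() ->
         supervisor:start_link(?MODULE, []).

     start_worker(SupPid) ->
         supervisor:start_child(SupPid, []).

     init([]) ->
         %% temporary: a crashed worker is reported but never restarted.
         Worker = {worker, {?MODULE, worker_start_link, []},
                   temporary, brutal_kill, worker, [?MODULE]},
         {ok, {{simple_one_for_one, 0, 1}, [Worker]}}.

     worker_start_link() ->
         {ok, proc_lib:spawn_link(?MODULE, worker, [])}.

     %% Hypothetical worker: waits until it is told to crash.
     worker() ->
         receive crash -> exit(boom) end.

     %% Start one worker, crash it, and return the supervisor's child
     %% list before and after the crash.
     demo() ->
         {ok, Sup} = start_link(),
         {ok, Worker} = start_worker(Sup),
         Before = supervisor:which_children(Sup),
         Worker ! crash,
         timer:sleep(100),    % give the supervisor time to handle the exit
         After = supervisor:which_children(Sup),
         {Before, After}.

Calling sup_demo:demo() from the shell should print a supervisor report
for the crashed worker, which is exactly the logging you do not get from
an unattended process. supervisor:which_children/1 gives roughly the same
overview as appmon, just in text form.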


> Thanks for the feedback !

You're welcome :-)


NOTE (to anyone who should later read this thread): I've made a few
mistakes in my previous code sample:
  * The supervisor should start its first child (acceptor) as part of
    its own init (hidden behind the tcp_sup:start_link/1 API call).
    Otherwise no one will be listening on the ListenSocket.
  * tcp_srv:start_link/2 should return {ok, pid()} instead of pid() to
    adhere to common expectations for start_link/X functions.

In module tcp_sup:
     start_link({port, Port}) ->
         {ok, ListenSocket} = gen_tcp:listen(Port, [binary, {packet, 0}, {reuseaddr, true}, {active, true}]),
         start_link({listen_socket, ListenSocket});
     start_link({listen_socket, ListenSocket}) ->
         {ok, SupPid} = supervisor:start_link(?MODULE, [ListenSocket]),
         {ok, _Pid} = tcp_sup:start_child(SupPid),    % <-- Start the first acceptor.
         {ok, SupPid}.

In module tcp_srv:
      start_link(SupPid, ListenSocket) ->
          {ok, proc_lib:spawn_link(?MODULE, acceptor, [SupPid, ListenSocket])}.


Ladislav Lenart


>> NOTE: I haven't even attempted to compile the following code (taken
>> from the book and adapted to your use case).
>>
>> Modified process structure:
>>     simple_one_for_one - one for each ListenSocket
>>         loop - one for each existing TCP connection on the ListenSocket
>>         acceptor - one on the ListenSocket
>>
>>
>> %%%%%%%%%%%%%%%%%%%%%%
>> %%% TCP supervisor %%%
>> %%%%%%%%%%%%%%%%%%%%%%
>> -module(tcp_sup).
>>
>> -behaviour(supervisor).
>>
>> -export([start_link/1, start_child/1]).
>> -export([init/1]).
>>
>> start_link({port, Port}) ->
>>     {ok, ListenSocket} = gen_tcp:listen(Port, [binary, {packet, 0},
>>                                         {reuseaddr, true}, {active, true}]),
>>     start_link({listen_socket, ListenSocket});
>> start_link({listen_socket, ListenSocket}) ->
>>     supervisor:start_link(?MODULE, [ListenSocket]).
>>
>> start_child(SupPid) ->
>>     supervisor:start_child(SupPid, []).
>>
>> init([ListenSocket]) ->
>>     Server = {tcp_srv, {tcp_srv, start_link, [self(), ListenSocket]},
>>               temporary, brutal_kill, worker, [tcp_srv]},
>>     RestartStrategy = {simple_one_for_one, 0, 1},    %<-- tune for production demands
>>     {ok, {RestartStrategy, [Server]}}.
>>
>>
>> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>> %%% TCP server (acceptor + loop for one TCP connection) %%%
>> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>> -module(tcp_srv).
>>
>> -export([start_link/2]).
>> -export([acceptor/2]).
>>
>> start_link(SupPid, ListenSocket) ->
>>     proc_lib:spawn_link(?MODULE, acceptor, [SupPid, ListenSocket]).
>>
>> acceptor(SupPid, ListenSocket) ->
>>     {ok, Socket} = gen_tcp:accept(ListenSocket),
>>     %% Instruct the tcp_sup SupPid to start a new acceptor process.
>>     tcp_sup:start_child(SupPid),
>>     error_logger:info_msg("New connection from ~p~n", [Socket]),
>>     inet:setopts(Socket, [binary, {nodelay, true}, {active, true}]),
>>     loop(Socket).
>>
>> loop(Socket) ->
>>     %% As before.
>>
>>
>> You should also consider introducing flow control to limit memory
>> usage under heavy load, using {active, false} for ListenSocket and
>> {active, once} for Socket:
>> 1. [in tcp_sup:start_link/1]
>>    {ok, ListenSocket} = gen_tcp:listen(Port, [binary, {packet, 0},
>>        {reuseaddr, true}, {active, false}]),
>> 2. [in tcp_srv:acceptor/2]
>>    inet:setopts(Socket, [binary, {nodelay, true}, {active, once}]),
>> 3. [modify tcp_srv:loop/1]
>>    loop(Socket) ->
>>        receive
>>            {tcp, Socket, Data} ->
>>                inet:setopts(Socket, [{active, once}]),    %<-- added line
>>                error_logger:info_msg("Messaged received from ~p: ~p~n",
>>                                      [Socket, Data]),
>>                comm_lib:handle_message(Socket, Data),
>>                loop(Socket);
>>            {tcp_closed, Socket} ->
>>                error_logger:info_msg("Device at ~p disconnected~n",
>>                                      [Socket]);
>>            _Any ->
>>                %% skip this
>>                loop(Socket)
>>        end.
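>>
>> The {active, once} pattern can be tried in isolation with a minimal
>> sketch like the one below. The module name once_demo and the loopback
>> client are assumptions made only for this illustration; they are not
>> part of your application:
>>
>>    -module(once_demo).
>>    -export([run/0]).
>>
>>    %% Send a few bytes over loopback and return how many the server
>>    %% side received using {active, once}.
>>    run() ->
>>        {ok, Listen} = gen_tcp:listen(0, [binary, {packet, 0},
>>                                          {reuseaddr, true},
>>                                          {active, false}]),
>>        {ok, Port} = inet:port(Listen),
>>        Parent = self(),
>>        spawn_link(fun() ->
>>            {ok, Socket} = gen_tcp:accept(Listen),
>>            %% Ask for exactly one message; nothing else is delivered
>>            %% until we ask again.
>>            inet:setopts(Socket, [{active, once}]),
>>            Parent ! {received, loop(Socket, 0)}
>>        end),
>>        {ok, Client} = gen_tcp:connect({127,0,0,1}, Port,
>>                                       [binary, {active, false}]),
>>        ok = gen_tcp:send(Client, <<"hello">>),
>>        ok = gen_tcp:close(Client),
>>        Result = receive
>>                     {received, Bytes} -> Bytes
>>                 after 5000 ->
>>                     timeout
>>                 end,
>>        gen_tcp:close(Listen),
>>        Result.
>>
>>    loop(Socket, Bytes) ->
>>        receive
>>            {tcp, Socket, Data} ->
>>                %% Re-arm the socket for the next message.
>>                inet:setopts(Socket, [{active, once}]),
>>                loop(Socket, Bytes + byte_size(Data));
>>            {tcp_closed, Socket} ->
>>                Bytes
>>        end.
>>
>> Because the socket goes back to passive mode after every delivered
>> message, a slow loop/2 makes the sender back off (via TCP flow
>> control) instead of filling the process mailbox.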
>>
>>
>> HTH,
>>
>> Ladislav Lenart
>>
>>
>> On 5.9.2011 18:59, Reynaldo Baquerizo wrote:
>>>
>>> I have a running application that consists of a supervisor and two
>>> generic servers. One of them wraps around odbc and the other handles
>>> tcp connections. A fragment of the relevant code is:
>>>
>>>
>>> init([]) ->
>>>      process_flag(trap_exit, true),
>>>      {ok, ListenSocket} = gen_tcp:listen(Port, [binary, {packet, 0},
>>>                                          {reuseaddr, true},
>>>                                          {active, true}]),
>>>      proc_lib:spawn_link(?MODULE, acceptor, [ListenSocket])
>>>
>>> acceptor(ListenSocket) ->
>>>      {ok, Socket} = gen_tcp:accept(ListenSocket),
>>>      error_logger:info_msg("New connection from ~p~n", [Socket]),
>>>      _Pid = proc_lib:spawn(?MODULE, acceptor, [ListenSocket]),
>>>      inet:setopts(Socket, [binary, {nodelay, true}, {active, true}]),
>>>      loop(Socket).
>>>
>>> loop(Socket) ->
>>>      receive
>>>         {tcp, Socket, Data} ->
>>>             error_logger:info_msg("Messaged received from ~p: ~p~n",
>>>                                   [Socket, Data]),
>>>             comm_lib:handle_message(Socket, Data),
>>>             loop(Socket);
>>>         {tcp_closed, Socket} ->
>>>             error_logger:info_msg("Device at ~p disconnected~n", [Socket]);
>>>         _Any ->
>>>             %% skip this
>>>             loop(Socket)
>>>      end.
>>>
>>> So, I basically start a new unlinked process for every new tcp
>>> connection. It works just fine for a couple of hours, but then every
>>> tcp connection is gradually dropped with the message "Device at ~p
>>> disconnected". The client will try to reconnect if the connection is
>>> closed. The tcp connection should only terminate if the remote end
>>> closes it or the spawned process in the server crashes.
>>>
>>> After all connections were dropped, I can see with inet:i() that there
>>> are established connections but no logging!
>>>
>>> Can anyone give some insight or point to the right direction to debug
>>> this?



