emfile error

Patrik Winroth pew@REDACTED
Tue Dec 4 10:19:49 CET 2001


It is indeed important to keep track of resources: no matter how many file
descriptors you have available, they *will* run out some day.

Crashes caused by, e.g., Mnesia running short of FDs are not a pretty thing :-)

It would be very nice to have this functionality in the OTP platform, i.e.
keeping track of file descriptors in a manner similar to the (incomplete)
example below.

Regards,

/Patrik.


(fe@REDACTED)1> fd_keeper:max_and_reserved().
{16300,24}

%%%
%% file descriptor management.
%%%
-export([stop_accepting/0,
	 start_accepting/0,
	 node_wants_alarm/0,
	 all_slaves_out_fds/0,
	 max_and_reserved/0,
	 how_many_reserved/0,
	 how_many_reserved_total/0,
	 reserve_fds/1,
	 unconditional_reserve_fds/1,
	 free_fds/1,
	 free_all_fds/0]).

%%%
%% node_wants_alarm(..) - used to check if the node wants an alarm
%%%
node_wants_alarm() ->
    case catch ets:lookup(?NODE_SESSION_TABLE, ?ALARM) of
	[{_,B}] ->
	    B;
	_ ->
	    false
    end.

%%%
%% all_slaves_out_fds(..) - used to check if the ?ALL_SLAVES_ALARM flag is set
%%%
all_slaves_out_fds() ->
    case catch ets:lookup(?NODE_SESSION_TABLE, ?ALL_SLAVES_ALARM) of
	[{_,B}] ->
	    B;
	_ ->
	    false
    end.

%%%
%% Use how_many_reserved_total(..) to see how many file descriptors
%% are reserved on the master and slaves summed.
%%%
how_many_reserved_total() ->
    lists:foldl(
      fun(N, Sum) ->
	      Sum + fdapi:call_in_slave(N,?MODULE,how_many_reserved,[])
      end, how_many_reserved(), fdapi:slave_nodes()).

%%%
%% Use how_many_reserved(..) to see how many file descriptors
%% are reserved.
%%%
how_many_reserved() ->
    case catch ets:lookup(?NODE_SESSION_TABLE, ?CNT) of
	[{_, N}] ->
	    N;
	_ ->
	    0
    end.

%%%
%% Use reserve_fds(..) to reserve file descriptors - if it returns true
%% the reserved fds may be used, if false they may not be used.
%% The same process that wants to use the descriptors must also
%% reserve them.
%% Update: if the node is shutting down, and no new sessions may be
%% started, ?ACCEPTING is false.
%%%
reserve_fds(N) ->
    case catch ets:lookup(?NODE_SESSION_TABLE, ?ACCEPTING) of
	[{_, true}] ->
	    N1 = ets:update_counter(?NODE_SESSION_TABLE, ?CNT, N),
	    case lookup_maxfds() of
		Max when Max >= N1 ->
		    make_reservation(N);
		_ ->
		    ets:insert(?NODE_SESSION_TABLE, {?ALARM, true}),
		    ets:update_counter(?NODE_SESSION_TABLE, ?CNT, -N),
		    false
	    end;
	_ ->
	    false
    end.

%%%
%% unconditional_reserve_fds(..) may only be used by the inet_server, when
%% starting gateways (i.e. no session threads may use it).
%%%
unconditional_reserve_fds(N) ->
    ets:update_counter(?NODE_SESSION_TABLE, ?CNT, N),
    make_reservation(N).

make_reservation(N) ->
    Pid = self(),
    case ets:lookup(?NODE_SESSION_TABLE, Pid) of
	[] ->
	    ets:insert(?NODE_SESSION_TABLE, {Pid, N});
	[_] ->
	    ets:update_counter(?NODE_SESSION_TABLE, Pid, N)
    end,
    erlang:link(whereis(?NODE_SESSION_SERVER)),
    true.

%%%
%% free_fds(..) is used to free previously reserved fds.
%% *Must* be called from the same process that reserved them.
%%%
free_fds(N) ->
    free_fds(N, self()).

%%%
%% free_fds(..) internal function.
%%%
free_fds(N, Pid) ->
    ets:update_counter(?NODE_SESSION_TABLE, ?CNT, -N),
    case catch ets:update_counter(?NODE_SESSION_TABLE, Pid, -N) of
	N1 when N1 =< 0 ->
	    ets:delete(?NODE_SESSION_TABLE, Pid),
	    %% This unlink may be executed in is_node_servers
	    %% context, but that is ok.
	    erlang:unlink(whereis(?NODE_SESSION_SERVER)),
	    ok;
	_ ->
	    ok
    end.

%%%
%% free_all_fds(..) is used to free all of the calling process's fds.
%%%
free_all_fds() ->
    free_pids_fds(self()).

%%%
%% free_pids_fds(..) is used to free all fds allocated to a pid, primarily
%% when a supervised process dies on us.
%%%
free_pids_fds(Pid) ->
    case catch ets:lookup(?NODE_SESSION_TABLE, Pid) of
	[{_, N}] ->
	    free_fds(N, Pid);
	_ ->
	    ok
    end.

get_session_id() ->
    ets:update_counter(?NODE_SESSION_TABLE, ?SESS_CNT, 1).
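
The example above leaves out the table setup and some exported functions
(start_accepting/0, stop_accepting/0, max_and_reserved/0, lookup_maxfds/0).
What follows is only a minimal sketch of how those pieces might fit together,
not the original implementation. The module name, the table name, the key
names and the init/1 function are all assumptions made for illustration, and
the per-process bookkeeping and linking shown above are deliberately omitted.

```erlang
-module(fd_keeper_sketch).
-export([init/1, start_accepting/0, stop_accepting/0,
         max_and_reserved/0, reserve_fds/1, free_fds/1]).

%% Hypothetical table name; the original uses ?NODE_SESSION_TABLE.
-define(TAB, fd_keeper_sketch_tab).

%% Create the table and seed the keys the counters rely on.
%% MaxFds is an assumed, caller-supplied limit (for instance a value
%% chosen somewhat below the OS per-process descriptor limit).
init(MaxFds) when is_integer(MaxFds), MaxFds > 0 ->
    ?TAB = ets:new(?TAB, [named_table, public, set]),
    true = ets:insert(?TAB, [{max_fds, MaxFds}, {cnt, 0}, {accepting, true}]),
    ok.

%% Allow new reservations again.
start_accepting() ->
    true = ets:insert(?TAB, {accepting, true}),
    ok.

%% Refuse new reservations, e.g. while the node is shutting down.
stop_accepting() ->
    true = ets:insert(?TAB, {accepting, false}),
    ok.

%% Return {Max, Reserved}, in the spirit of the shell example above.
max_and_reserved() ->
    [{_, Max}] = ets:lookup(?TAB, max_fds),
    [{_, Cnt}] = ets:lookup(?TAB, cnt),
    {Max, Cnt}.

%% Optimistically bump the counter, then roll back if the limit is exceeded.
reserve_fds(N) when is_integer(N), N > 0 ->
    case ets:lookup(?TAB, accepting) of
        [{_, true}] ->
            [{_, Max}] = ets:lookup(?TAB, max_fds),
            case ets:update_counter(?TAB, cnt, N) of
                Total when Total =< Max ->
                    true;
                _ ->
                    ets:update_counter(?TAB, cnt, -N),
                    false
            end;
        _ ->
            false
    end.

%% Give back N previously reserved descriptors.
free_fds(N) when is_integer(N), N > 0 ->
    ets:update_counter(?TAB, cnt, -N),
    ok.
```

A caller would reserve before opening a socket or file and free afterwards,
e.g. reserve_fds(1), open the descriptor only if that returned true, and call
free_fds(1) once it is closed.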




On Tue, 4 Dec 2001, Alex Arnon wrote:

> You should try throttling the accept rate - set a limit on the number of
> concurrent connections, and simply stop accepting once that is
> exhausted. You can also possibly increase the listen queue size.
> Just like memory and disk space, descriptors are a finite resource; you
> should always write your servers with limits in mind.
>
>
> -----Original Message-----
> From: Rick Pettit [mailto:rpettit@REDACTED]
> Sent: Monday, December 03, 2001 11:44 PM
> To: erlang-questions@REDACTED
> Subject: emfile error
>
>
>
> I have an erlang process which runs as a concurrent TCP server.  It
> functions as a gateway between non-erlang processes and Erlang
> processes.
>
> The server listens on a well-known TCP port and spawns a child to handle
> each request.  The child will parse TCP input and forward the request via
> Erlang messaging to the appropriate Erlang server, waiting for a response
> (another Erlang message) before responding over TCP and closing the TCP
> connection.
>
> The problem is that when the server is hit hard the clients receive an
> emfile error, which I understand could be the result of a UNIX process
> (the erlang node, in this case) running out of descriptors.
>
> Is this a known problem with concurrent erlang servers?  I would not
> expect this same error if my server was in C and it fork()'d children, as
> each child would then have its own descriptor table and would have very
> few entries in it (stdin, stdout, stderr, TCP client socket).
>
> I wonder if the entire node appears to the host OS as a single process
> (perhaps one with many threads), in which case I would expect this problem
> with most any concurrent server.
>
> Please forgive me if I am doing something silly or am missing some
> fundamental coding convention that would have alleviated this problem.
>
> Rick
>
>
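
The throttling advice quoted above (cap the number of concurrent connections,
stop accepting once the cap is reached, and let the listen queue absorb the
rest) could be sketched roughly as follows. This is an illustration under
assumptions, not code from the thread: the module name, the start/3 interface,
the backlog value of 128 and the 100 ms timeouts are all made up for the
example.

```erlang
-module(throttled_acceptor).
-export([start/3]).

%% start(Port, MaxConns, Handler) -> {ok, AcceptorPid, ActualPort}
%% Handler is a fun taking the connected socket; it runs in its own process.
start(Port, MaxConns, Handler) when is_integer(MaxConns), MaxConns > 0 ->
    Parent = self(),
    Pid = spawn(fun() ->
        %% {backlog, 128} is an assumed listen queue size.
        {ok, LSock} = gen_tcp:listen(Port, [binary, {active, false},
                                            {reuseaddr, true},
                                            {backlog, 128}]),
        {ok, ActualPort} = inet:port(LSock),
        Parent ! {self(), ActualPort},
        loop(LSock, MaxConns, 0, Handler)
    end),
    receive
        {Pid, ActualPort} -> {ok, Pid, ActualPort}
    end.

%% At the limit: do not call accept at all, just wait for a handler to exit.
%% New clients wait in the kernel's listen queue in the meantime.
loop(LSock, Max, Active, Handler) when Active >= Max ->
    receive
        {'DOWN', _Ref, process, _Pid, _Reason} ->
            loop(LSock, Max, Active - 1, Handler)
    end;
%% Below the limit: accept with a timeout so exits can still be counted.
loop(LSock, Max, Active, Handler) ->
    case gen_tcp:accept(LSock, 100) of
        {ok, Sock} ->
            {Pid, _Ref} = spawn_monitor(fun() ->
                receive
                    {socket, S} ->
                        Handler(S),
                        gen_tcp:close(S)
                end
            end),
            %% Hand the socket over so it is closed if the handler dies.
            ok = gen_tcp:controlling_process(Sock, Pid),
            Pid ! {socket, Sock},
            loop(LSock, Max, Active + 1, Handler);
        {error, timeout} ->
            loop(LSock, Max, drain(Active), Handler);
        {error, emfile} ->
            %% Out of descriptors anyway: back off before retrying.
            timer:sleep(100),
            loop(LSock, Max, drain(Active), Handler);
        {error, closed} ->
            ok
    end.

%% Consume any pending handler exits without blocking.
drain(Active) ->
    receive
        {'DOWN', _Ref, process, _Pid, _Reason} -> drain(Active - 1)
    after 0 ->
        Active
    end.
```

MaxConns should be chosen well below the node's descriptor limit, since the
node also needs descriptors for files, distribution and so on.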

-- 
Patrik Winroth                         <pew@REDACTED>
Vindaloo AB                            mbl: 0709-727364



