[erlang-questions] Understanding supervisor / start_link behaviour
Steve Strong
steve@REDACTED
Wed Jun 1 21:26:39 CEST 2011
Hi,
I've got some strange behaviour with gen_event within a supervision tree which I don't fully understand. Consider the following supervisor (completely standard, feel free to skip over):
<snip>
-module(sup).
-behaviour(supervisor).
-export([start_link/0, init/1]).
-define(SERVER, ?MODULE).
start_link() ->
    supervisor:start_link({local, ?SERVER}, ?MODULE, []).

init([]) ->
    Child1 = {child, {child, start_link, []}, permanent, 2000, worker, [child]},
    {ok, {{one_for_all, 1000, 3600}, [Child1]}}.
</snip>
and the corresponding gen_server (the interesting parts are the gen_event:start_link call in init/1 and the deliberate crash in handle_info/2):
<snip>
-module(child).
-behaviour(gen_server).
-export([start_link/0, init/1, handle_call/3, handle_cast/2,
         handle_info/2, terminate/2, code_change/3]).

start_link() ->
    gen_server:start_link({local, child}, child, [], []).

init([]) ->
    io:format("about to start gen_event~n"),
    %% Start a registered gen_event linked to this gen_server.
    X = gen_event:start_link({local, my_gen_event}),
    io:format("gen_event started with ~p~n", [X]),
    %% Fails with badmatch if my_gen_event is still registered.
    {ok, _Pid} = X,
    %% The 2000 ms timeout delivers a 'timeout' message to handle_info/2.
    {ok, {}, 2000}.

handle_call(_Request, _From, State) ->
    {reply, ok, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

handle_info(_Info, State) ->
    io:format("about to crash...~n"),
    %% Deliberate badmatch to crash the server.
    1 = 2,
    {noreply, State}.

terminate(_Reason, _State) ->
    ok.

code_change(_OldVsn, State, _Extra) ->
    {ok, State}.
</snip>
If I run this from an erl shell like this:
<snip>
--> erl
Erlang R14B01 (erts-5.8.2) [source] [64-bit] [smp:2:2] [rq:2] [async-threads:0] [hipe] [kernel-poll:false]
Eshell V5.8.2 (abort with ^G)
1> application:start(sasl), supervisor:start_link(sup, []).
</snip>
Then the supervisor and server start as expected. After 2 seconds the server receives a timeout message and deliberately crashes; the supervisor spots this and restarts it. In its init, the gen_server also calls start_link on a gen_event process. As I understand it, because the two processes are linked, the gen_event should be terminated whenever the gen_server process exits.
However, every now and then I see the following output (a ton of sasl trace omitted for clarity!):
<snip>
about to crash...
about to start gen_event
gen_event started with {error,{already_started,<0.79.0>}}
about to start gen_event
gen_event started with {error,{already_started,<0.79.0>}}
about to start gen_event
</snip>
What is happening is that the gen_server is crashing but on its restart the gen_event process is still running - hence the gen_server fails in its init and gets restarted again. Sometimes this loop clears after a few iterations, other times it can continue until the parent supervisor gives up, packs its bags and goes home.
So, my question is whether this is expected behaviour or not. I assume that the termination of the linked child is happening asynchronously, and that the supervisor is hence restarting its children before things have cleaned up correctly - is that correct?
I can fix this particular scenario by trapping exits within the gen_server, and then calling gen_event:stop within the terminate. Is this type of processing necessary whenever a process is start_link'ed within a supervisor tree, or is what I'm doing considered bad practice?
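For reference, the workaround described above might look roughly like the following. This is only a sketch of the idea, not a recommendation: it is a variant of the child module that traps exits and stops the event manager in terminate/2. The exit(deliberate_crash) call stands in for the original 1 = 2, the 'EXIT' clause and the try/catch around gen_event:stop/1 are additions made for illustration, and the sketch assumes a recent OTP release in which gen_event:stop/1 waits for the manager to terminate before returning.

```erlang
-module(child).
-behaviour(gen_server).
-export([start_link/0, init/1, handle_call/3, handle_cast/2,
         handle_info/2, terminate/2, code_change/3]).

start_link() ->
    gen_server:start_link({local, child}, child, [], []).

init([]) ->
    %% Trap exits so that terminate/2 also runs when the supervisor
    %% shuts this process down, and so that the death of the linked
    %% event manager arrives as a message in handle_info/2.
    process_flag(trap_exit, true),
    {ok, _Pid} = gen_event:start_link({local, my_gen_event}),
    {ok, {}, 2000}.

handle_call(_Request, _From, State) ->
    {reply, ok, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

handle_info(timeout, _State) ->
    io:format("about to crash...~n"),
    %% Deliberate crash, standing in for the original 1 = 2.
    exit(deliberate_crash);
handle_info({'EXIT', _From, Reason}, State) ->
    %% A linked process (here, the event manager) died: stop as well.
    {stop, Reason, State};
handle_info(_Info, State) ->
    {noreply, State}.

terminate(_Reason, _State) ->
    %% Stop the event manager explicitly before this process exits,
    %% so its registered name should be free by the time the
    %% supervisor restarts us.
    try gen_event:stop(my_gen_event)
    catch exit:_ -> ok   % the manager was already gone
    end,
    ok.

code_change(_OldVsn, State, _Extra) ->
    {ok, State}.
```

gen_server calls terminate/2 when a callback raises an exception, so the explicit stop also runs on the deliberate crash; trapping exits additionally covers a shutdown requested by the supervisor.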
Thanks for your time,
Steve
-- Steve Strong, Director, id3as
twitter.com/srstrong