[erlang-questions] Understanding supervisor / start_link behaviour

Steve Strong steve@REDACTED
Thu Jun 2 14:23:23 CEST 2011


That makes a good deal of sense. I guess the point at which something should be promoted into the supervision tree, rather than simply being started with start_link from its parent, is when it becomes complex enough that it may have issues if multiple instances of the process are running simultaneously. At that point it stops sounding like a trivial helper process and starts looking like something that should be managed more actively.

I completely agree that registering the gen_event wasn't useful, and that not doing so would solve the problem. That was pretty obvious as soon as I spotted the issue; this thread was more to get opinions on how things are best structured.
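
As a minimal sketch of the "don't register it" option, the gen_server could start the gen_event without a name and keep the returned pid in its own state. The module name child_unregistered and the get_event_mgr call are illustrative assumptions, not part of the original code, and the deliberate timeout crash is left out:

```erlang
-module(child_unregistered).
-behaviour(gen_server).
-export([start_link/0, init/1, handle_call/3, handle_cast/2,
         handle_info/2, terminate/2, code_change/3]).

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

init([]) ->
    %% No registered name, so a restarted instance cannot hit
    %% {error, {already_started, Pid}} while the previous gen_event
    %% is still shutting down.
    {ok, EventMgr} = gen_event:start_link(),
    %% The pid kept in the state is the only handle to the manager.
    {ok, EventMgr}.

%% Hypothetical accessor, only here so the pid can be inspected.
handle_call(get_event_mgr, _From, EventMgr) ->
    {reply, EventMgr, EventMgr};
handle_call(_Request, _From, EventMgr) ->
    {reply, ok, EventMgr}.

handle_cast(_Msg, State) ->
    {noreply, State}.

handle_info(_Info, State) ->
    {noreply, State}.

terminate(_Reason, _State) ->
    ok.

code_change(_OldVsn, State, _Extra) ->
    {ok, State}.
```

The link still ties the two processes together, so the old gen_event goes away when the gen_server dies; it just no longer matters whether that has finished by the time the supervisor restarts the server.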

-- 
Steve Strong, Director, id3as
twitter.com/srstrong

On Thursday, 2 June 2011 at 11:53, Mazen Harake wrote:

> Steve,
> 
> I wouldn't say that you are wrong. I think that your reasoning is sound
> about not putting the gen_event process under a supervisor because
> *that is what links are for*. Just because you have a supervisor
> doesn't mean that you shove everything underneath it! If the
> gen_server and the gen_event are truly linked (meaning: the gen_server
> doesn't act as a "supervisor" that keeps track of its gen_event process
> and restarts it all the time, but rather they really are linked
> and they crash together) then your approach, in my opinion, is good.
> 
> There are great benefits in doing it that way. Many will claim that
> it is best practice to put *everything* under a supervisor, but this is
> simply not true. In 90% of cases it *is* the best thing to do, and many
> times it is more about how you designed your application than
> where you put the supervisors and their children, but doing it the way
> you did is not necessarily wrong.
> 
> The only problem I see with your approach is that you have registered
> the gen_event process which clearly isn't useful (since only the
> gen_server should know about it, after all, it started it). Other than
> that, this approach is extremely helpful and a nice way to clean up
> things after they die/shutdown (Again: assuming truly linked).
> 
> There is a big misconception in the community that everything
> should/must look like the supervisor-tree model, which shows how
> gen_servers are put under supervisors and more supervisors under the
> "top" supervisor. That is not enforced, and the design principles
> don't take into account the many cases where this setup actually brings
> more headaches than simply exiting and cleaning up using linked
> processes (because such cases do exist).
> 
> /M
> 
> On 1 June 2011 21:26, Steve Strong <steve@REDACTED> wrote:
> > Hi,
> > 
> > I've got some strange behaviour with gen_event within a supervision tree
> > which I don't fully understand. Consider the following supervisor
> > (completely standard, feel free to skip over):
> > <snip>
> > -module(sup).
> > -behaviour(supervisor).
> > -export([start_link/0, init/1]).
> > -define(SERVER, ?MODULE).
> > start_link() ->
> >  supervisor:start_link({local, ?SERVER}, ?MODULE, []).
> > init([]) ->
> >  Child1 = {child, {child, start_link, []}, permanent, 2000, worker,
> > [child]},
> >  {ok, {{one_for_all, 1000, 3600}, [Child1]}}.
> > </snip>
> > and the corresponding gen_server (the interesting code is the gen_event start in init/1):
> > <snip>
> > -module(child).
> > -behaviour(gen_server).
> > -export([start_link/0, init/1, handle_call/3, handle_cast/2,
> > handle_info/2, terminate/2, code_change/3]).
> > start_link() ->
> >  gen_server:start_link({local, child}, child, [], []).
> > init([]) ->
> >  io:format("about to start gen_event~n"),
> >  X = gen_event:start_link({local, my_gen_event}),
> >  io:format("gen_event started with ~p~n", [X]),
> >  {ok, _Pid} = X,
> >  {ok, {}, 2000}.
> > handle_call(_Request, _From, State) ->
> >  {reply, ok, State}.
> > handle_cast(_Msg, State) ->
> >  {noreply, State}.
> > handle_info(_Info, State) ->
> >  io:format("about to crash...~n"),
> >  1 = 2,
> >  {noreply, State}.
> > terminate(_Reason, _State) ->
> >  ok.
> > code_change(_OldVsn, State, _Extra) ->
> >  {ok, State}.
> > </snip>
> > If I run this from an erl shell like this:
> > <snip>
> > --> erl
> > Erlang R14B01 (erts-5.8.2) [source] [64-bit] [smp:2:2] [rq:2]
> > [async-threads:0] [hipe] [kernel-poll:false]
> > Eshell V5.8.2 (abort with ^G)
> > 1> application:start(sasl), supervisor:start_link(sup, []).
> > </snip>
> > 
> > Then the supervisor & server start as expected. After 2 seconds the server
> > gets a timeout message and crashes itself; the supervisor obviously spots
> > this and restarts it. Within the init of the gen_server, it also does a
> > start_link on a gen_event process. By my understanding, whenever the
> > gen_server process exits, the gen_event will also be terminated.
> > However, every now and then I see the following output (a ton of sasl trace
> > omitted for clarity!):
> > <snip>
> > about to crash...
> > about to start gen_event
> > gen_event started with {error,{already_started,<0.79.0>}}
> > about to start gen_event
> > gen_event started with {error,{already_started,<0.79.0>}}
> > about to start gen_event
> > </snip>
> > What is happening is that the gen_server is crashing but on its restart the
> > gen_event process is still running - hence the gen_server fails in its init
> > and gets restarted again. Sometimes this loop clears after a few
> > iterations, other times it can continue until the parent supervisor gives
> > up, packs its bags and goes home.
> > So, my question is whether this is expected behaviour or not. I assume that
> > the termination of the linked child is happening asynchronously, and that
> > the supervisor is hence restarting its children before things have cleaned
> > up correctly - is that correct?
> > I can fix this particular scenario by trapping exits within the gen_server,
> > and then calling gen_event:stop within the terminate. Is this type of
> > processing necessary whenever a process is start_link'ed within a supervisor
> > tree, or is what I'm doing considered bad practice?
> > Thanks for your time,
> > Steve
> > --
> > Steve Strong, Director, id3as
> > twitter.com/srstrong
