[erlang-questions] Understanding supervisor / start_link behaviour

Steve Strong steve@REDACTED
Thu Jun 2 16:23:40 CEST 2011


That is an interesting point, and not something I'd considered to date.

-- 
Steve Strong

On Thursday, June 2, 2011 at 4:10 PM, Frédéric Trottier-Hébert wrote:

> There are disadvantages to *not* putting workers under the supervision tree, though. Namely, you'll be losing the ability to have the release handlers walk down the supervision trees to find which processes to suspend/update, and you'll then need to find a different way of doing things.
> 
> This is a serious point to consider if you ever plan on going the way of releases/appups if the workers you use are to be long-lived (you don't want them to be killed during a purge). I'm not saying you didn't know this, but I felt I should point it out for the sake of having the arguments clear on the mailing list. 
> 
> --
> Fred Hébert
> http://www.erlang-solutions.com
> 
> 
> On 2011-06-02, at 05:53 AM, Mazen Harake wrote:
> 
> > Steve,
> > 
> > I wouldn't say that you are wrong. I think your reasoning is sound
> > about not putting the gen_event process under a supervisor because
> > *that is what links are for*. Just because you have a supervisor
> > doesn't mean that you shove everything underneath it! If the
> > gen_server and the gen_event are truly linked (meaning: gen_server
> > doesn't act as a "supervisor" keeping track of its gen_event process
> > and restarts it all the time but rather that they really are linked
> > and they crash together) then your approach, in my opinion, is good.
> > 
> > There are great benefits in doing it in that way. Many will claim that
> > it is best practice to put *everything* under a supervisor but this is
> > simply not true. In 90% of cases it *is* the best thing to do and many
> > times it is more about how you designed your application rather than
> > where to put the supervisors and their children but doing it the way
> > you did is not necessarily wrong.
> > 
> > The only problem I see with your approach is that you have registered
> > the gen_event process which clearly isn't useful (since only the
> > gen_server should know about it, after all, it started it). Other than
> > that, this approach is extremely helpful and a nice way to clean up
> > things after they die/shutdown (Again: assuming truly linked).
> > 
> > There is a big misconception in the community that everything
> > should/must look like the supervisor-tree model which shows how
> > gen_servers are put under supervisors and more supervisors under the
> > "top" supervisor but that is not enforced and the design principles
> > don't take many cases into account where this setup actually brings
> > more headache to the table than to just exit and clean up using linked
> > processes (because they do exist).
> > 
> > /M
> > 
> > On 1 June 2011 21:26, Steve Strong <steve@REDACTED> wrote:
> > > Hi,
> > > 
> > > I've got some strange behaviour with gen_event within a supervision tree
> > > which I don't fully understand. Consider the following supervisor
> > > (completely standard, feel free to skip over):
> > > <snip>
> > > -module(sup).
> > > -behaviour(supervisor).
> > > -export([start_link/0, init/1]).
> > > -define(SERVER, ?MODULE).
> > > start_link() ->
> > >  supervisor:start_link({local, ?SERVER}, ?MODULE, []).
> > > init([]) ->
> > >  Child1 = {child, {child, start_link, []}, permanent, 2000, worker,
> > > [child]},
> > >  {ok, {{one_for_all, 1000, 3600}, [Child1]}}.
> > > </snip>
> > > and corresponding gen_server (interesting code in bold):
> > > <snip>
> > > -module(child).
> > > -behaviour(gen_server).
> > > -export([start_link/0, init/1, handle_call/3, handle_cast/2,
> > > handle_info/2, terminate/2, code_change/3]).
> > > start_link() ->
> > >  gen_server:start_link({local, child}, child, [], []).
> > > init([]) ->
> > >  io:format("about to start gen_event~n"),
> > >  X = gen_event:start_link({local, my_gen_event}),
> > >  io:format("gen_event started with ~p~n", [X]),
> > >  {ok, _Pid} = X,
> > >  {ok, {}, 2000}.
> > > handle_call(_Request, _From, State) ->
> > >  {reply, ok, State}.
> > > handle_cast(_Msg, State) ->
> > >  {noreply, State}.
> > > handle_info(_Info, State) ->
> > >  io:format("about to crash...~n"),
> > >  1 = 2,
> > >  {noreply, State}.
> > > terminate(_Reason, _State) ->
> > >  ok.
> > > code_change(_OldVsn, State, _Extra) ->
> > >  {ok, State}.
> > > </snip>
> > > If I run this from an erl shell like this:
> > > <snip>
> > > --> erl
> > > Erlang R14B01 (erts-5.8.2) [source] [64-bit] [smp:2:2] [rq:2]
> > > [async-threads:0] [hipe] [kernel-poll:false]
> > > Eshell V5.8.2 (abort with ^G)
> > > 1> application:start(sasl), supervisor:start_link(sup, []).
> > > </snip>
> > > 
> > > Then the supervisor & server start as expected. After 2 seconds the server
> > > gets a timeout message and crashes itself; the supervisor obviously spots
> > > this and restarts it. Within the init of the gen_server, it also does a
> > > start_link on a gen_event process. By my understanding, whenever the
> > > gen_server process exits, the gen_event will also be terminated.
> > > However, every now and then I see the following output (a ton of sasl trace
> > > omitted for clarity!):
> > > <snip>
> > > about to crash...
> > > about to start gen_event
> > > gen_event started with {error,{already_started,<0.79.0>}}
> > > about to start gen_event
> > > gen_event started with {error,{already_started,<0.79.0>}}
> > > about to start gen_event
> > > </snip>
> > > What is happening is that the gen_server is crashing but on its restart the
> > > gen_event process is still running - hence the gen_server fails in its init
> > > and gets restarted again. Sometimes this loop clears after a few
> > > iterations, other times it can continue until the parent supervisor gives
> > > up, packs its bags and goes home.
> > > So, my question is whether this is expected behaviour or not. I assume that
> > > the termination of the linked child is happening asynchronously, and that
> > > the supervisor is hence restarting its children before things have cleaned
> > > up correctly - is that correct?
> > > I can fix this particular scenario by trapping exits within the gen_server,
> > > and then calling gen_event:stop within the terminate. Is this type of
> > > processing necessary whenever a process is start_link'ed within a supervisor
> > > tree, or is what I'm doing considered bad practice?
> > > Thanks for your time,
> > > Steve
> > > --
> > > Steve Strong, Director, id3as
> > > twitter.com/srstrong (http://twitter.com/srstrong)
> > > 
> > > 
> > > _______________________________________________
> > > erlang-questions mailing list
> > > erlang-questions@REDACTED (mailto:erlang-questions@REDACTED)
> > > http://erlang.org/mailman/listinfo/erlang-questions

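For readers of the archive, the workaround mentioned in the original question (trapping exits in the gen_server and calling gen_event:stop from terminate) might look roughly like the sketch below. It is not code from the thread: the module name child_fixed is hypothetical, keeping the gen_event pid as the server state is an assumption, and the monitor with a 1000 ms wait is an added safeguard for OTP versions where gen_event:stop may return before the process has fully gone away.

```erlang
-module(child_fixed).   %% hypothetical name, not from the thread
-behaviour(gen_server).
-export([start_link/0]).
-export([init/1, handle_call/3, handle_cast/2,
         handle_info/2, terminate/2, code_change/3]).

start_link() ->
    gen_server:start_link({local, child_fixed}, ?MODULE, [], []).

init([]) ->
    %% Trap exits so terminate/2 also runs when the supervisor
    %% shuts this process down.
    process_flag(trap_exit, true),
    {ok, Pid} = gen_event:start_link({local, my_gen_event}),
    %% Assumption: the gen_event pid is kept as the server state.
    {ok, Pid}.

handle_call(_Request, _From, State) ->
    {reply, ok, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

%% If the linked gen_event dies, stop as well so the two go down together.
handle_info({'EXIT', Pid, Reason}, Pid) ->
    {stop, Reason, Pid};
handle_info(_Info, State) ->
    {noreply, State}.

terminate(_Reason, Pid) ->
    %% Stop the gen_event and wait until it is really gone, so that
    %% the registered name is free before a restart runs init/1 again.
    Ref = erlang:monitor(process, Pid),
    try gen_event:stop(Pid) catch _:_ -> ok end,
    receive
        {'DOWN', Ref, process, Pid, _} -> ok
    after 1000 ->
        %% Assumption: give up after one second rather than block.
        ok
    end.

code_change(_OldVsn, State, _Extra) ->
    {ok, State}.
```

The point of waiting for the 'DOWN' message is that the gen_server does not finish terminating until the gen_event is gone, so a supervisor restart should no longer race with the old my_gen_event registration. Dropping the registered name altogether, as suggested earlier in the thread, would avoid the clash in a different way.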