release_handler: cannot find top supervisor for application *

Paul Guyot pguyot@REDACTED
Thu Mar 18 15:30:48 CET 2010


Hello,

There seems to be an incompatibility between stdlib 1.16.5 and sasl 2.1.9 and earlier.
This incompatibility was introduced with OTP-8324 (github commit 88b530ea24977081020feb2123124063e58dfc12). As a consequence, one might not be able to upgrade a root supervisor in release procedures, and releases produce error reports like:

=ERROR REPORT==== 2010-03-18 08:04:50 ===
release_handler: cannot find top supervisor for application mnesia

(and it goes on for every application).

Since commit 88b530ea24977081020feb2123124063e58dfc12, sys:get_status returns the formatted state from the module's format_status/2 function. For a supervisor, it means calling gen_server:format_status/2. As a result, release_handler:get_supervisor_module1/1 fails with a bad match.

get_supervisor_module1(SupPid) ->
  {status, _Pid, {module, _Mod},
   [_PDict, _SysState, _Parent, _Dbg, Misc]} = sys:get_status(SupPid),
  [_Name, State, _Type, _Time] = Misc,  %% <--- bad match happens here
  %% Cannot use #supervisor_state{module = Module} = State.
  {ok, element(#supervisor_state.module, State)}.

Before 88b530ea24977081020feb2123124063e58dfc12, Misc was a list with four elements. It is now a list with three elements, and the actual state of the supervisor is deep inside the third element.

For example, this is the result I get with mnesia (on R13B04):

P = application_controller:get_master(mnesia),
{Root, _} = application_master:get_child(P),
sys:get_status(Root).

{status,<0.36.0>,
      {module,gen_server},
      [[{'$ancestors',[<0.35.0>]},
        {'$initial_call',{supervisor,mnesia_sup,1}}],
       running,<0.35.0>,[],
       [{header,"Status for generic server mnesia_sup"},
        {data,[{"Status",running},
               {"Parent",<0.35.0>},
               {"Logged events",[]}]},
        {data,[{"State",
                {state,{local,mnesia_sup},
                       one_for_all,
                       [{child,<0.38.0>,mnesia_kernel_sup,
                               {mnesia_kernel_sup,start,[]},
                               permanent,infinity,supervisor,...},
                        {child,<0.37.0>,mnesia_event,
                               {mnesia_sup,start_event,...},
                               permanent,30000,...}],
                       {dict,0,16,16,8,80,48,...},
                       0,3600,[],mnesia_sup,
                       [[]]}}]}]]}
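
To illustrate the shape change, here is a minimal sketch of a helper that accepts either form of Misc and returns the state term. The module and function names (sup_status_sketch, state_from_misc/1) are hypothetical and are not part of release_handler. The sketch assumes the default formatting shown above, where the state sits under a {"State", State} pair inside a {data, ...} tuple; it is not a proposed patch.

```erlang
-module(sup_status_sketch).
-export([state_from_misc/1]).

%% Hypothetical helper, for illustration only.
%% Old shape (before the commit): Misc = [Name, State, Type, Time].
state_from_misc([_Name, State, _Type, _Time]) ->
    {ok, State};
%% New shape (assumed from the output above):
%% Misc = [{header, _}, {data, _}, {data, [{"State", State}]}].
state_from_misc(Misc) when is_list(Misc) ->
    %% Collect every term stored under the "State" key of a data section.
    case [S || {data, Data} <- Misc, is_list(Data), {"State", S} <- Data] of
        [State | _] -> {ok, State};
        []          -> error
    end;
state_from_misc(_Other) ->
    error.
```

With the mnesia output above, state_from_misc/1 would return the {state, ...} tuple wrapped in {ok, ...}; the caller would still have to extract the callback module from it, as get_supervisor_module1/1 does with element/2.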

Moreover, sys:get_status calls gen_server:format_status, which in turn calls Module:format_status if it exists and uses the result as the last element of the Misc list. As a consequence, we cannot be sure how to get the state (in order to find out the supervisor callback module). In other words, a quick fix that matches against the new result of sys:get_status might fail for application root supervisors that implement a custom format_status/2. However, the format_status/2 callback itself is not documented in supervisor(3) (it is documented in gen_server(3)).

Since all this code in release_handler is preceded by a note:

%% Note: The following is a terrible hack done in order to resolve the
%% problem stated in ticket OTP-3452.

I believe this deserves a rewrite (or a revert of 88b530ea24977081020feb2123124063e58dfc12).

Paul
-- 
Semiocast                    http://semiocast.com/
+33.175000290 - 62 bis rue Gay-Lussac, 75005 Paris