os_mon & alarm_handler in R10B-10

Micael Karlberg bmk@REDACTED
Tue Mar 28 11:10:48 CEST 2006


Hi,

The new (and delete) function is an optional one. Also,
there is no defined return value for this function.
Therefore it's not worth the effort to try to figure out
whether the result is ok or not.
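
As a rough illustration of the point above, here is a minimal sketch of how a result from the 'new' operation gets absorbed. The module and function names (new_op_sketch, my_table, failing_table, call_new) are made up for this example and are not part of the snmp application; only the `catch apply(M, F, [Operation | A])` shape is taken from the quoted snmpa_mib_data code below.

```erlang
-module(new_op_sketch).
-export([my_table/1, failing_table/1, call_new/1]).

%% Hypothetical instrumentation function. Only the optional
%% new/delete operations are shown; the return value is arbitrary
%% because the caller does not inspect it.
my_table(new)    -> ok;
my_table(delete) -> ok.

%% Hypothetical instrumentation function whose 'new' fails, for
%% instance because a database it depends on is not running yet.
failing_table(new) ->
    exit({aborted, {node_not_running, node()}}).

%% Same shape as the quoted agent code: the call is wrapped in
%% 'catch', so an exit turns into an ordinary {'EXIT', Reason} term
%% instead of terminating the calling process.
call_new({M, F, A}) ->
    catch apply(M, F, [new | A]).
```

With this sketch, call_new({new_op_sketch, my_table, []}) returns ok, while call_new({new_op_sketch, failing_table, []}) returns an {'EXIT', {aborted, ...}} tuple that the caller is free to drop.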

/BMK

Serge Aleynikov wrote:
> Gunilla,
> 
> I believe there might be another bug in SNMP revealed by my experiments 
> with OS_MON & OTP_MIBS.  If mnesia is started *after* the snmp agent, 
> and the snmp agent has the mibs parameter set, an attempt to initialize 
> mib OIDs using instrumentation functions with the 'new' operation (such 
> as otp_mib:erl_node_table(new)), leads to an ignored exception that 
> ideally should prevent the SNMP agent from starting.
> 
> Release file:
> =============
> {release, {"dripdb", "1.0"}, {erts, "5.4.13"},
>   [
>     {kernel  , "2.10.13"},
>     {stdlib  , "1.13.12"},
>     {sasl    , "2.1.1"},
>     {lama    , "1.0"},
>     {otp_mibs, "1.0.4"},
>     {os_mon  , "2.0"},
>     {snmp    , "4.7.1"},
>     {mnesia  , "4.2.5"}
>   ]
> }.
> 
> Config file:
> ============
> 
> %%------------ SNMP agent configuration ----------------------
>   {snmp,
>      [{agent,
>         [{config, [{dir, "etc/snmp/"},
>                    {force_load, true}
>                   ]},
>          {db_dir, "var/snmp_db/"},
>          {mibs,   ["mibs/priv/OTP-MIB",
>                    "mibs/priv/OTP-OS-MON-MIB"]}
>         ]
>       }
>      ]
>   }
> 
> This is a trace of the error which hides the fact that there was a 
> problem with creation of the 'erlNodeAlloc' table:
> 
> (<0.126.0>) call 
> snmpa_mib_data:call_instrumentation({me,[1,3,6,1,4,1,193,19,3,1,2,1,1,1],
>     table_entry,
>     erlNodeEntry,
>     undefined,
>     'not-accessible',
>     {otp_mib,erl_node_table,[]},
>     false,
>     [{table_entry_with_sequence,'ErlNodeEntry'}],
>     undefined,
>     undefined},new)
> (<0.126.0>) returned from snmpa_mib_data:call_instrumentation/2 ->
>   {'EXIT',{aborted,{node_not_running,drpdb@REDACTED}}}
> 
> Therefore all the SNMP manager's calls to OIDs inside 'erlNodeTable' or 
> 'applTable' tables fail.
> 
> I can provide additional details if the information here is 
> not sufficient.  I believe the proper action would be not to 
> absorb the error in the call_instrumentation function when the Operation 
> is 'new'.  Here is the snippet of code where that exception is 
> currently ignored:
> 
> snmpa_mib_data.erl(line 1319):
> ==============================
> call_instrumentation(#me{entrytype = variable, mfa={M,F,A}}, Operation) ->
>     ?vtrace("call instrumentation with"
>         "~n   entrytype: variable"
>         "~n   MFA:       {~p,~p,~p}"
>         "~n   Operation: ~p",
>         [M,F,A,Operation]),
>     catch apply(M, F, [Operation | A]);
> ...
> 
> 
> Regards,
> 
> Serge
> 
> 
> Gunilla Arendt wrote:
> 
>> It's a bug in os_mon, it shouldn't use get_alarms().
>> Thanks for the heads up.
>>
>> Regards, Gunilla
>>
>>
>> Serge Aleynikov wrote:
>>
>>> For now I used the following patch to take care of this issue, but I 
>>> would be curious to hear the opinion of the OTP staff.
>>>
>>> Regards,
>>>
>>> Serge
>>>
>>> --- alarm_handler.erl.orig      Fri Mar 24 20:08:18 2006
>>> +++ alarm_handler.erl   Fri Mar 24 20:19:15 2006
>>> @@ -58,7 +58,12 @@
>>>  %% Returns: [{AlarmId, AlarmDesc}]
>>>  %%-----------------------------------------------------------------
>>>  get_alarms() ->
>>> -    gen_event:call(alarm_handler, alarm_handler, get_alarms).
>>> +    case gen_event:which_handlers(alarm_handler) of
>>> +    [M | _] ->
>>> +        gen_event:call(alarm_handler, M, get_alarms);
>>> +    [] ->
>>> +        []
>>> +    end.
>>>
>>>  add_alarm_handler(Module) when atom(Module) ->
>>>      gen_event:add_handler(alarm_handler, Module, []).
>>>
>>>
>>> Serge Aleynikov wrote:
>>>
>>>> Hi,
>>>>
>>>> I've been experimenting with the reworked os_mon in R10B-10, and 
>>>> encountered the following issue.
>>>>
>>>> The documentation encourages replacing the default alarm handler 
>>>> with something more sophisticated.  For that reason I created a 
>>>> custom handler - lama_alarm_h (LAMA app in jungerl), which uses 
>>>> gen_event:swap_sup_handler/3.
>>>>
>>>> I initiate that handler prior to starting OS_MON, and then start 
>>>> OS_MON.
>>>>
>>>> In the latest release R10B-10, OS_MON calls 
>>>> alarm_handler:get_alarms/0 upon startup.
>>>>
>>>> This causes the 'alarm_handler' event manager to issue a call to the 
>>>> alarm_handler.erl module.  However, since that handler was replaced 
>>>> by a custom alarm handler, the gen_event's call fails with
>>>> {error, bad_module}.
>>>>
>>>> gen_event always dispatches a call/3 to a specific handler module 
>>>> passed as a parameter, e.g.:
>>>>
>>>> -----[alarm_handler.erl (line: 60)]-----
>>>> get_alarms() ->
>>>>     gen_event:call(alarm_handler, alarm_handler, get_alarms).
>>>> ----------------------------------------
>>>>
>>>> Yet, if the alarm_handler handler was swapped for another module, the 
>>>> gen_event:call will report an error, therefore crashing OS_MON.
>>>>
>>>> One way to resolve this problem would be to introduce another 
>>>> exported function in gen_event:
>>>>
>>>> gen_event:call(EventMgrRef, Request) -> Result
>>>>
>>>> Can the OTP team suggest some other workaround?
>>>>
>>>> Serge
>>>>
>>>
>>
>>
> 
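
For reference, the get_alarms/0 failure from the first message in this thread can probably be reproduced outside os_mon with a sketch like the one below. Everything named here (swap_sketch, safe_get_alarms/1, demo/0, not_installed_handler) is hypothetical and only for illustration; the which_handlers-based lookup follows the idea of the patch quoted above rather than any code shipped in OTP.

```erlang
-module(swap_sketch).
-behaviour(gen_event).
-export([demo/0, safe_get_alarms/1]).
-export([init/1, handle_event/2, handle_call/2, handle_info/2,
         terminate/2, code_change/3]).

%% Minimal handler standing in for a custom alarm handler.
%% Its state is simply the list of alarms.
init(_Args) -> {ok, []}.
handle_event(_Event, Alarms) -> {ok, Alarms}.
handle_call(get_alarms, Alarms) -> {ok, Alarms, Alarms}.
handle_info(_Info, Alarms) -> {ok, Alarms}.
terminate(_Reason, _Alarms) -> ok.
code_change(_OldVsn, Alarms, _Extra) -> {ok, Alarms}.

%% Ask whichever handler is installed first, instead of assuming
%% a fixed handler module.
safe_get_alarms(Mgr) ->
    case gen_event:which_handlers(Mgr) of
        [Handler | _] -> gen_event:call(Mgr, Handler, get_alarms);
        []            -> []
    end.

demo() ->
    {ok, Mgr} = gen_event:start_link(),
    ok = gen_event:add_handler(Mgr, ?MODULE, []),
    %% Addressing a handler module that is not installed in the manager:
    Direct = gen_event:call(Mgr, not_installed_handler, get_alarms),
    %% Looking up the installed handler first:
    Safe = safe_get_alarms(Mgr),
    ok = gen_event:stop(Mgr),
    {Direct, Safe}.
```

Here Direct is the {error, bad_module} result described in the original report, and Safe is the (empty) alarm list kept by the stand-in handler. Note that picking the first installed handler assumes it actually understands the get_alarms request.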


