os_mon & alarm_handler in R10B-10
Serge Aleynikov
serge@REDACTED
Mon Mar 27 20:32:12 CEST 2006
Gunilla,
I believe my experiments with OS_MON & OTP_MIBS revealed another bug in
SNMP. If mnesia is started *after* the snmp agent, and the snmp agent
has the mibs parameter set, an attempt to initialize MIB OIDs using
instrumentation functions with the 'new' operation (such as
otp_mib:erl_node_table(new)) raises an exception that is silently
ignored. Ideally it should prevent the SNMP agent from starting.
Release file:
=============
{release, {"dripdb", "1.0"}, {erts, "5.4.13"},
 [
  {kernel  , "2.10.13"},
  {stdlib  , "1.13.12"},
  {sasl    , "2.1.1"},
  {lama    , "1.0"},
  {otp_mibs, "1.0.4"},
  {os_mon  , "2.0"},
  {snmp    , "4.7.1"},
  {mnesia  , "4.2.5"}
 ]
}.
Config file:
============
%%------------ SNMP agent configuration ----------------------
{snmp,
 [{agent,
   [{config, [{dir, "etc/snmp/"},
              {force_load, true}
             ]},
    {db_dir, "var/snmp_db/"},
    {mibs, ["mibs/priv/OTP-MIB",
            "mibs/priv/OTP-OS-MON-MIB"]}
   ]
  }
 ]
}
Here is a trace of the error. Because the exception is caught, nothing
reveals that creating the 'erlNodeAlloc' table failed:
(<0.126.0>) call
snmpa_mib_data:call_instrumentation({me,[1,3,6,1,4,1,193,19,3,1,2,1,1,1],
    table_entry,
    erlNodeEntry,
    undefined,
    'not-accessible',
    {otp_mib,erl_node_table,[]},
    false,
    [{table_entry_with_sequence,'ErlNodeEntry'}],
    undefined,
    undefined},new)
(<0.126.0>) returned from snmpa_mib_data:call_instrumentation/2 ->
    {'EXIT',{aborted,{node_not_running,drpdb@REDACTED}}}
As a result, all SNMP manager requests for OIDs inside the
'erlNodeTable' or 'applTable' tables fail.
I can provide additional details if the information here is not
sufficient. I believe the proper fix would be for the
call_instrumentation function not to absorb the error when the
Operation is 'new'. Here is the snippet of code where that exception is
currently ignored:
snmpa_mib_data.erl(line 1319):
==============================
call_instrumentation(#me{entrytype = variable, mfa={M,F,A}}, Operation) ->
    ?vtrace("call instrumentation with"
            "~n entrytype: variable"
            "~n MFA: {~p,~p,~p}"
            "~n Operation: ~p",
            [M,F,A,Operation]),
    catch apply(M, F, [Operation | A]);
...
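
To make the suggestion concrete, here is a minimal sketch of the idea,
not a patch against snmpa_mib_data.erl. The module name instr_sketch
and the function failing_table/1 are hypothetical, and the MFA is
passed as a plain tuple instead of the #me{} record so that the module
compiles on its own:

```erlang
-module(instr_sketch).
-export([call_instrumentation/2, failing_table/1]).

%% Hypothetical stand-in for snmpa_mib_data:call_instrumentation/2.
%% Assumption: the MFA is passed directly rather than inside #me{}.
call_instrumentation({M, F, A}, new) ->
    %% No catch here: a failure while creating the table reaches the
    %% caller, which can then refuse to start the agent.
    apply(M, F, [new | A]);
call_instrumentation({M, F, A}, Operation) ->
    %% Every other operation keeps the current behaviour and absorbs
    %% the exception.
    catch apply(M, F, [Operation | A]).

%% Hypothetical instrumentation function that fails in the same way as
%% the trace above, where mnesia was not yet running.
failing_table(_Operation) ->
    exit({aborted, {node_not_running, node()}}).
```

With this sketch, calling
instr_sketch:call_instrumentation({instr_sketch, failing_table, []}, new)
exits with {aborted, {node_not_running, Node}}, while the same call
with any other operation still returns the {'EXIT', ...} tuple.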
Regards,
Serge
Gunilla Arendt wrote:
> It's a bug in os_mon, it shouldn't use get_alarms().
> Thanks for the heads up.
>
> Regards, Gunilla
>
>
> Serge Aleynikov wrote:
>
>> For now I used the following patch to take care of this issue, but I
>> would be curious to hear the opinion of the OTP staff.
>>
>> Regards,
>>
>> Serge
>>
>> --- alarm_handler.erl.orig Fri Mar 24 20:08:18 2006
>> +++ alarm_handler.erl Fri Mar 24 20:19:15 2006
>> @@ -58,7 +58,12 @@
>>  %% Returns: [{AlarmId, AlarmDesc}]
>>  %%-----------------------------------------------------------------
>>  get_alarms() ->
>> -    gen_event:call(alarm_handler, alarm_handler, get_alarms).
>> +    case gen_event:which_handlers(alarm_handler) of
>> +        [M | _] ->
>> +            gen_event:call(alarm_handler, M, get_alarms);
>> +        [] ->
>> +            []
>> +    end.
>>
>>  add_alarm_handler(Module) when atom(Module) ->
>>      gen_event:add_handler(alarm_handler, Module, []).
>>
>>
>> Serge Aleynikov wrote:
>>
>>> Hi,
>>>
>>> I've been experimenting with the reworked os_mon in R10B-10, and
>>> encountered the following issue.
>>>
>>> The documentation encourages replacing the default alarm handler
>>> with something more sophisticated. For that reason I created a
>>> custom handler - lama_alarm_h (LAMA app in jungerl), which uses
>>> gen_event:swap_sup_handler/3.
>>>
>>> I initiate that handler prior to starting OS_MON, and then start OS_MON.
>>>
>>> In the latest release R10B-10, OS_MON calls
>>> alarm_handler:get_alarms/0 upon startup.
>>>
>>> This causes the 'alarm_handler' event manager to issue a call to the
>>> alarm_handler.erl module. However, since that handler was replaced
>>> by a custom alarm handler, the gen_event call fails with
>>> {error, bad_module}.
>>>
>>> gen_event always dispatches a call/3 to a specific handler module
>>> passed as a parameter, e.g.:
>>>
>>> -----[alarm_handler.erl (line: 60)]-----
>>> get_alarms() ->
>>>     gen_event:call(alarm_handler, alarm_handler, get_alarms).
>>> ----------------------------------------
>>>
>>> Yet, if the alarm_handler handler was swapped for another module,
>>> gen_event:call returns an error, which crashes OS_MON.
>>>
>>> One way to resolve this problem would be to introduce another
>>> exported function in gen_event:
>>>
>>> gen_event:call(EventMgrRef, Request) -> Result
>>>
>>> Can the OTP team suggest some other workaround?
>>>
>>> Serge
>>>
>>
>
>
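
The gen_event behaviour discussed in the quoted messages can be
reproduced outside of OS_MON with a small self-contained sketch. The
module name swap_sketch and the handler name not_installed_h are
hypothetical. The event manager here is anonymous rather than the real
alarm_handler process, and the handler keeps its alarms in a plain
list:

```erlang
-module(swap_sketch).
-behaviour(gen_event).
-export([demo/0]).
-export([init/1, handle_event/2, handle_call/2, handle_info/2,
         terminate/2, code_change/3]).

demo() ->
    {ok, Pid} = gen_event:start_link(),
    ok = gen_event:add_handler(Pid, ?MODULE, []),
    %% Addressing a handler module that is not installed fails, which
    %% is what happens once the default alarm_handler is swapped out.
    {error, bad_module} = gen_event:call(Pid, not_installed_h, get_alarms),
    %% Looking up the installed handler first, as in the patch, works.
    [Handler | _] = gen_event:which_handlers(Pid),
    Result = gen_event:call(Pid, Handler, get_alarms),
    ok = gen_event:stop(Pid),
    Result.

%% Minimal gen_event callbacks; the state is the list of alarms.
init(_Args) -> {ok, []}.
handle_event(_Event, Alarms) -> {ok, Alarms}.
handle_call(get_alarms, Alarms) -> {ok, Alarms, Alarms}.
handle_info(_Info, Alarms) -> {ok, Alarms}.
terminate(_Arg, _Alarms) -> ok.
code_change(_OldVsn, Alarms, _Extra) -> {ok, Alarms}.
```

Note that this lookup only helps if the replacement handler also
answers the get_alarms request.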
--
Serge Aleynikov
R&D Telecom, IDT Corp.
Tel: (973) 438-3436
Fax: (973) 438-1464
serge@REDACTED