recording process crash (in supervisor?)
Serge Aleynikov
serge@REDACTED
Fri Sep 30 14:05:05 CEST 2005
Rick Pettit wrote:
[...]
> Having to dig up the (undocumented?) format of the crash_report message (and
> count on it not changing across releases) troubles me (duh). Surely what I
> did must be "wrong", no?
You can look at proc_lib:format(Report) for formatting crash reports,
and extracting appropriate details.
> Also, it now seems clear that I need _two_ processes in addition to the
> supervisor to do a job that a simple supervisor callback could do just as
> well--one child/worker process (to invoke gen_event:add_sup_handler/3) and
> the actual gen_event handler process to receive and process error_logger
> messages.
Not quite. In reality you either need no additional processes, or you
need one: a supervised guard for the event handler. The primary
difference between gen_event and gen_server is that a gen_server runs in
a dedicated process of its own, whereas a gen_event handler runs in the
context of the EventManager process to which it is added. If (and only
if) fault tolerance is needed, a separate process can be used to catch
the event handler's crash notifications. This is done with
gen_event:add_sup_handler/3, which instructs the EventManager to send a
message to the calling process when the event handler is removed; other
than that, this process does nothing. If you need fault tolerance for
the event handler, you can add this worker process to a supervisor,
where it simply implements a loop:
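init() ->
    %% Register the handler; this process is notified if it is removed.
    ok = gen_event:add_sup_handler(error_logger, ?MODULE, []),
    loop().

loop() ->
    receive
        {gen_event_EXIT, ?MODULE, Reason} ->
            %% Handler was removed; exit so the supervisor restarts us.
            exit(Reason);
        _Other ->
            loop()
    end.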
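To supervise that guard process, a supervisor along the following lines
should do. This is only a sketch: crash_log_sup and crash_guard are
hypothetical module names, and crash_guard is assumed to export a
start_link/0 that spawns a linked process running the init/loop code
above.

```erlang
-module(crash_log_sup).
-behaviour(supervisor).

-export([start_link/0, init/1]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

init([]) ->
    %% crash_guard is a hypothetical module; its start_link/0 is assumed
    %% to return {ok, Pid} for a process running the guard loop.
    Guard = {crash_guard,
             {crash_guard, start_link, []},
             permanent, 5000, worker, [crash_guard]},
    %% Restart the guard (which re-adds the handler) at most 5 times
    %% in any 10 second window.
    {ok, {{one_for_one, 5, 10}, [Guard]}}.
```

With a permanent child, every exit of the guard causes a restart, so the
handler is re-added each time the EventManager reports its removal.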
> Am I doing something unconventional here (i.e. processing/recording process
> crash info)? It seems like there should be an easier way. It also seems as
> though my error_logger handler, which only really cares about crash_report
> information, is going to have to "ignore" a whole lot of other messages which
> a supervisor (callback/handler) wouldn't even see--this seems needlessly
> inefficient.
If you examine SASL's and KERNEL's error reporting, this is how it's
done there (irrelevant messages are ignored). I am not in a position to
question the efficiency of this approach, as this hasn't been an issue
in the applications I've been building.
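As a rough illustration of the "ignore irrelevant messages" pattern, a
handler might look like the sketch below. The module name crash_recorder
and the record/1 helper are hypothetical, and the event shape
{error_report, GL, {Pid, crash_report, Report}} is what I understand
error_logger to deliver for proc_lib crash reports; please check it
against the release you run.

```erlang
-module(crash_recorder).
-behaviour(gen_event).

-export([init/1, handle_event/2, handle_call/2, handle_info/2,
         terminate/2, code_change/3]).

init(_Args) ->
    {ok, []}.

%% Only crash reports are of interest; the first clause picks them out.
handle_event({error_report, _GL, {_Pid, crash_report, Report}}, State) ->
    record(Report),
    {ok, State};
%% Everything else the error_logger sends is silently ignored.
handle_event(_Other, State) ->
    {ok, State}.

handle_call(_Request, State) -> {ok, ok, State}.
handle_info(_Info, State) -> {ok, State}.
terminate(_Reason, _State) -> ok.
code_change(_OldVsn, State, _Extra) -> {ok, State}.

%% Hypothetical helper: here the formatted report is just printed.
record(Report) ->
    io:put_chars(proc_lib:format(Report)).
```

The catch-all clause costs one failed pattern match per unrelated event.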
One thought, though: an OTP process crash is an infrequent event
(compared to all normal processing). The cost of processing crash info
is therefore likely to be irrelevant to the efficiency of the system as
a whole.
Regards,
Serge