recording process crash (in supervisor?)
Rick Pettit
rpettit@REDACTED
Fri Sep 30 19:19:08 CEST 2005
On Fri, Sep 30, 2005 at 08:05:05AM -0400, Serge Aleynikov wrote:
> Rick Pettit wrote:
> [...]
> >Having to dig up the (undocumented?) format of the crash_report message
> >(and count on it not changing across releases) troubles me (duh). Surely
> >what I did must be "wrong", no?
>
> You can look at proc_lib:format(Report) for formatting crash reports,
> and extracting appropriate details.
This is the documentation I was looking for--thank you.
> >Also, it now seems clear that I need _two_ processes in addition to the
> >supervisor to do a job that a simple supervisor callback could do just as
> >well--one child/worker process (to invoke gen_event:add_sup_handler/3) and
> >the actual gen_event handler process to receive and process error_logger
> >messages.
>
> Not quite. In reality you either don't need any additional processes,
> or need one - a supervised guard of the event handler. The primary
> difference between gen_event and gen_server is that gen_server runs in
> the context of a dedicated process of its own, whereas gen_event runs in
> the context of an EventManager to which the event handler is being
> added.
Duh, of course. Silly me.
> If (and only if) fault tolerance is needed, a separate process
> can be used to trap the event handler's crash messages. This is
> accomplished by using gen_event:add_sup_handler/3, which will instruct
> the EventManager to send a message to that process indicating that the
> event handler was removed, but other than that this process will do
> nothing. If you needed fault tolerance of the event handler, you could
> add this worker process to a supervisor, where this process would simply
> implement a loop:
>
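>     init() ->
>         gen_event:add_sup_handler(error_logger, ?MODULE, []),
>         loop().
>
>     loop() ->
>         receive
>             {gen_event_EXIT, ?MODULE, Reason} ->
>                 %% The handler was removed; exit so a supervisor can restart us.
>                 exit(Reason);
>             _Other ->
>                 %% Ignore anything else and keep waiting.
>                 loop()
>         end.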
Perfect--I think I finally see the light.
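
For the archive, a minimal sketch of that guard as a module a supervisor could start. The module name crash_guard, the handler module name crash_recorder, and the start_link/0 entry point are assumptions made for illustration, not anything from Serge's message. It also assumes the error_logger event manager is already running when the guard starts.

```erlang
-module(crash_guard).
-export([start_link/0, init/0]).

%% Hypothetical gen_event handler module that does the actual recording.
-define(HANDLER, crash_recorder).

%% Start function for a supervisor child spec. Returns as soon as the
%% process is spawned, so the handler may be installed slightly later.
start_link() ->
    Pid = proc_lib:spawn_link(?MODULE, init, []),
    {ok, Pid}.

init() ->
    %% add_sup_handler/3 installs the handler and ties it to this
    %% process, so the manager tells us if the handler is removed.
    ok = gen_event:add_sup_handler(error_logger, ?HANDLER, []),
    loop().

loop() ->
    receive
        {gen_event_EXIT, ?HANDLER, Reason} ->
            %% Handler is gone; exit and let the supervisor restart us,
            %% which reinstalls the handler from init/0.
            exit(Reason);
        _Other ->
            loop()
    end.
```

Under these assumptions, an old-style child spec such as {crash_guard, {crash_guard, start_link, []}, permanent, 2000, worker, [crash_guard]} would put the guard under a supervisor.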
> >Am I doing something unconventional here (i.e. processing/recording process
> >crash info)? It seems like there should be an easier way. It also seems as
> >though my error_logger handler, which only really cares about crash_report
> >information, is going to have to "ignore" a whole lot of other messages
> >which a supervisor (callback/handler) wouldn't even see--this seems
> >needlessly inefficient.
>
> If you examine SASL's and KERNEL's error reporting, this is how it's
> done there (ignore irrelevant messages). I am not in a position to
> question the efficiency of this approach, as this hasn't been an issue
> in the applications I've been building.
Nor for me (IIRC the rule of thumb is to 1) make it work, 2) make it beautiful,
3) make it fast). I need to get past (1) first :-)
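
A rough sketch of the "ignore everything but crash reports" handler, assuming crash reports reach error_logger handlers as {error_report, GroupLeader, {Pid, crash_report, Report}} events. The module name crash_filter and the counter kept as handler state are made up for illustration. Formatting goes through proc_lib:format/1 as suggested above.

```erlang
-module(crash_filter).
-behaviour(gen_event).

-export([init/1, handle_event/2, handle_call/2, handle_info/2,
         terminate/2, code_change/3]).

%% State is a count of the crash reports seen so far.
init(_Args) ->
    {ok, 0}.

%% Crash reports: format with proc_lib:format/1 and print the result.
handle_event({error_report, _GL, {_Pid, crash_report, Report}}, Count) ->
    io:put_chars(proc_lib:format(Report)),
    {ok, Count + 1};
%% Every other error_logger event is ignored.
handle_event(_Event, Count) ->
    {ok, Count}.

%% Any call returns the current count as the reply.
handle_call(_Request, Count) ->
    {ok, Count, Count}.

handle_info(_Info, Count) ->
    {ok, Count}.

terminate(_Reason, _Count) ->
    ok.

code_change(_OldVsn, Count, _Extra) ->
    {ok, Count}.
```

The catch-all clause is a single pattern match per event, which is why ignoring the irrelevant messages should cost very little.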
> One thought though is that an OTP process crash is an infrequent event
> (compared to all normal processing). Therefore the question about
> efficiency of processing crash info might be irrelevant to the
> efficiency of the system as a whole, given its rare likelihood.
Agreed.
You have been very helpful, thanks again.
-Rick