[erlang-questions] New OTP app available at GitHub: riak_err

Scott Lystig Fritchie fritchie@REDACTED
Fri Nov 12 01:06:53 CET 2010


Max Lapshin <max.lapshin@REDACTED> wrote:

ml> You have to modify gen_server to make error-proof message logging:
ml> https://github.com/erlyvideo/erlyvideo/blob/master/src/core/gen_server_ems.erl#L747

Max, my reading of the code says that the formatting is not done by the
gen_server process ... instead, the gen_server sends the format string &
args to error_logger, and error_logger's event handlers do the actual
formatting.

Included at the end of this message is a bunch of testing that I did
with a very intentionally buggy piece of code called foo.erl.  I use
three different variations to try to get it to mis-behave.  I can't get
riak_err to format something that's "too "big".  If someone can find a
counter-example (preferably an executable one :-), I'd love to hear it.

If I use gen_server:start_link() instead, then the {'EXIT', ...} being
caught by the shell will cause lib:format() to be executed, and *that*
can cause huge messages to be formatted.  But the (slightly sad) moral
of the story is to be careful of what's linked to the shell ... and
that's a lesson that most Erlang folks end up learning anyway.

ml> [...] or yours one is better, but I want OTP team to communicate
ml> about it.  gen_server:format_state is not enough because of dumping
ml> message queue and error reason (it includes dumping state). Problem
ml> is larger and it requires looking into all gen_ code and proc_lib
ml> infrastructure.

Agreed.  The default should be to have some kind of size limits (and
therefore RAM-consuming limit) on any OTP-generated event/message/etc.
If that default also helps allow my/your/others' code (and their events)
to also observe size limits when formatting events(*), that'd be even
better.

-Scott

(*) Apologies to Arthur C. Clarke: "I'm sorry, Dave, I can't allow you
to point that gun at your foot.  Here's a smaller thing you can use
instead."

--- snip --- snip --- snip --- snip --- snip --- snip ---

Common steps:

    riak_err_handler:set_fmt_max_bytes(100).                                
    riak_err_handler:set_term_max_size(100).
    f(Pid).                                                     
    {ok, Pid} = foo:start().  

Variation #1:

    foo:crash(Pid).

Output:

    (riak@REDACTED)164>     foo:crash(Pid).
    byebye
    (riak@REDACTED)165> 
    =ERROR REPORT==== 11-Nov-2010::17:48:04 ===
    Oversize args for format "** Generic server ~p terminating 
    ** Last message in was ~p~n** When Server state == ~p~n** Reason for termination == ~n** ~p~n": [<0.3676.1>,{crash,"mmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm..."}...]
    =CRASH REPORT==== 11-Nov-2010::17:48:04 ===
      crasher:
        initial call: foo:init/1
        pid: <0.3676.1>
        registered_name: []
        exception exit: {call_was,{crash,"mmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm..."}...}
          in function  gen_server:handle_msg/5
          in call from proc_lib:init_p_do_apply/3
        ancestors: [<0.3605.1>]
        messages: []
        links: []
        dictionary: []
        trap_exit: false
        status: running
        heap_size: 17711
        stack_size: 24
        reductions: 6774
      neighbours:

Variation #2:

    gen_server:cast(Pid, please_crash).

Output:

    (riak@REDACTED)169>     gen_server:cast(Pid, please_crash).
    ok
    (riak@REDACTED)170> 
    =ERROR REPORT==== 11-Nov-2010::17:49:18 ===
    Oversize args for format "** Generic server ~p terminating 
    ** Last message in was ~p~n** When Server state == ~p~n** Reason for termination == ~n** ~p~n": [<0.4236.1>,{$gen_cast,please_crash},{state,"sssssssssssssssssssssssssssssssssssssssssssssssssss..."}...]
    =CRASH REPORT==== 11-Nov-2010::17:49:18 ===
      crasher:
        initial call: foo:init/1
        pid: <0.4236.1>
        registered_name: []
        exception exit: {bummer,please_crash,<<"cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc...">>}
          in function  gen_server:terminate/6
          in call from proc_lib:init_p_do_apply/3
        ancestors: [<0.3605.1>]
        messages: []
        links: []
        dictionary: []
        trap_exit: false
        status: running
        heap_size: 10946
        stack_size: 24
        reductions: 6373
      neighbours:

Variation #3:

    Pid ! {arbitrary_message, lists:duplicate(2*1024, $a)}, atom_here_to_avoid_confusion_with_bangs_return_value_at_the_shell.

Output:

    (riak@REDACTED)174>     Pid ! {arbitrary_message, lists:duplicate(2*1024, $a)}, atom_here_to_avoid_confusion_with_bangs_return_value_at_the_shell.
    atom_here_to_avoid_confusion_with_bangs_return_value_at_the_shell
    
    =ERROR REPORT==== 11-Nov-2010::17:50:19 ===
    Oversize args for format "~w: ~s:handle_info got ~w
    ": [<0.4648.1>,foo,{arbitrary_message,"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa..."}]
    (riak@REDACTED)175> 

Code for foo.erl follows.

-module(foo).
-author('fritchie@REDACTED').

-behaviour(gen_server).

-define(NAME, ?MODULE).
-define(Timeout, infinity).

%% External exports
-export([start/0]).
-export([crash/1]).

%% gen_server callbacks
-export([init/1, handle_call/3, handle_cast/2, handle_info/2, terminate/2,
         code_change/3]).

-record(state, {big}).

start() ->
    gen_server:start(?MODULE, [], []).

crash(Pid) ->
    gen_server:call(Pid, {crash, lists:duplicate(2*1024, $m)}).

init([]) ->
    {ok, #state{big = lists:duplicate(2*1024, $s)}}.

handle_call(Request, _From, State) ->
    {stop, {call_was, Request, lists:duplicate(2*1024, $z)}, byebye, State}.

handle_cast(Msg, _State) ->
    exit({bummer, Msg, list_to_binary(lists:duplicate(2*1024, $c))}).

handle_info(Info, State) ->
    error_logger:error_msg("~w: ~s:handle_info got ~w\n", [self(), ?MODULE, Info]),
    {noreply, State}.

terminate(_Reason, _State) ->
    ok.

code_change(_X, _Y, Z) ->
    {ok, Z}.


More information about the erlang-questions mailing list