strange crashes of logger and supervisor

Lukas Larsson lukas@REDACTED
Fri Oct 9 13:43:55 CEST 2020


Hello,

This seems to be the logger proxy, i.e. the process that logs messages sent
by the emulator and other nodes.

It should not crash when receiving these late replies to gen_server:call/2.
I will prepare a patch for it.
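
Until that lands, the general way to survive such a stray reply in a callback
module is a catch-all handle_info clause. In a plain gen_server it looks
roughly like the sketch below (logger_proxy follows the logger_olp callback
contract, so the actual patch will look a bit different):

    %% A reply to a gen_server:call/2,3 that already timed out arrives as
    %% a plain {Ref, Reply} message; drop it instead of crashing with
    %% function_clause.
    handle_info({Ref, _LateReply}, State) when is_reference(Ref) ->
        {noreply, State};
    handle_info(_Other, State) ->
        {noreply, State}.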

For OTP 24 there is a proposal for process aliases
(https://github.com/erlang/otp/pull/2735) that should get rid of this class
of bug.
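
Roughly, the idea is that the reply reference doubles as an "alias" that the
server replies to, and the alias is deactivated when the caller gives up, so
a late reply is dropped by the runtime instead of being delivered as a stray
message. A sketch of a call built on the proposed API (placeholder names, and
the final API may still change before release):

    %% The monitor reference is also an alias; the server is expected to
    %% reply with Mref ! {Mref, Reply}. demonitor/2 deactivates the alias,
    %% so anything sent to it afterwards never reaches our mailbox.
    call(ServerPid, Request, Timeout) ->
        Mref = erlang:monitor(process, ServerPid, [{alias, demonitor}]),
        ServerPid ! {request, Mref, Request},
        receive
            {Mref, Reply} ->
                erlang:demonitor(Mref, [flush]),
                {ok, Reply};
            {'DOWN', Mref, process, _, Reason} ->
                {error, Reason}
        after Timeout ->
            erlang:demonitor(Mref, [flush]),
            {error, timeout}
        end.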

Lukas

On Fri, Oct 9, 2020 at 11:47 AM Andreas Schultz <andreas.schultz@REDACTED> wrote:

> Hi,
>
> I got some really strange crashes of the logger and errors from the
> supervisor on 22.1.7:
>
> 2020-10-08T20:29:11.310911+00:00 error: Generic server <0.934.0>
> terminating. Reason:
> {function_clause,[{logger_proxy,handle_info,[{#Ref<0.3534272603.1552154625.127680>,ok},no_state],[{file,"logger_proxy.erl"},{line,136}]},{logger_olp,try_callback_call,4,[{file,"logger_olp.erl"},{line,598}]},{logger_olp,handle_info,2,[{file,"logger_olp.erl"},{line,270}]},{gen_server,try_dispatch,4,[{file,"gen_server.erl"},{line,680}]},{gen_server,handle_msg,6,[{file,"gen_server.erl"},{line,756}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,226}]}]}.
> Last message: {#Ref<0.3534272603.1552154625.127680>,ok}. State:
> #{burst_limit_enable => false,burst_limit_max_count =>
> 500,burst_limit_window_time => 1000,burst_msg_count => 0,burst_win_ts =>
> -576460751230297,cb_state => no_state,drop_mode_qlen => 1000,flush_qlen =>
> 5000,id => logger_proxy,idle => false,last_load_ts =>
> -575306488755097,last_qlen => 1,mode => async,mode_ref =>
> {logger_olp,logger_proxy},module => logger_proxy,overload_kill_enable =>
> false,overload_kill_mem_size => 3000000,overload_kill_qlen =>
> 20000,overload_kill_restart_after => 5000,sync_mode_qlen => 500}.
> 2020-10-08T20:29:12.179509+00:00 error: Supervisor received unexpected
> message: {#Ref<0.3534272603.1029701639.174367>,ok}
> 2020-10-08T20:29:13.992900+00:00 error: Supervisor received unexpected
> message: {#Ref<0.3534272603.1029701639.174370>,ok}
> 2020-10-08T20:29:12.797713+00:00 error: crasher: initial call:
> logger_olp:init/1, pid: <0.934.0>, registered_name: [], error:
> {function_clause,[{logger_proxy,handle_info,[{#Ref<0.3534272603.1552154625.127680>,ok},no_state],[{file,"logger_proxy.erl"},{line,136}]},{logger_olp,try_callback_call,4,[{file,"logger_olp.erl"},{line,598}]},{logger_olp,handle_info,2,[{file,"logger_olp.erl"},{line,270}]},{gen_server,try_dispatch,4,[{file,"gen_server.erl"},{line,680}]},{gen_server,handle_msg,6,[{file,"gen_server.erl"},{line,756}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,226}]}]},
> ancestors: [logger_sup,kernel_sup,<0.902.0>], message_queue_len: 1,
> messages: [{#Ref<0.3534272603.1552154625.128792>,ok}], links: [<0.932.0>],
> dictionary: [{olp_ref,{logger_proxy,<0.934.0>,{logger_olp,logger_proxy}}}],
> trap_exit: true, status: running, heap_size: 987, stack_size: 28,
> reductions: 84949; neighbours:
> 2020-10-08T20:29:15.492989+00:00 error: Supervisor received unexpected
> message: {#Ref<0.3534272603.1322254338.95824>,ok}
> 2020-10-08T20:29:21.399535+00:00 error: Supervisor received unexpected
> message: {#Ref<0.3534272603.1322254338.95915>,ok}
> 2020-10-08T20:29:22.982660+00:00 error: Supervisor: {local,logger_sup}.
> Context: child_terminated. Reason:
> {function_clause,[{logger_proxy,handle_info,[{#Ref<0.3534272603.1552154625.127680>,ok},no_state],[{file,"logger_proxy.erl"},{line,136}]},{logger_olp,try_callback_call,4,[{file,"logger_olp.erl"},{line,598}]},{logger_olp,handle_info,2,[{file,"logger_olp.erl"},{line,270}]},{gen_server,try_dispatch,4,[{file,"gen_server.erl"},{line,680}]},{gen_server,handle_msg,6,[{file,"gen_server.erl"},{line,756}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,226}]}]}.
> Offende
>
> It seems that somehow an unhandled {Ref, ok} message got into the mailboxes
> of these servers. I'm pretty sure that my code is not sending those messages
> to random processes. Does anyone have an idea how this could happen?
>
> One clue might be that the system was under heavy load, possibly even
> overloaded, as this log entry suggests:
>
> 2020-10-08T20:28:15.676829+00:00 error: Error in process <0.4111.816> on
> node 'gtphub-c-node@REDACTED' with exit value:,
> {{badmatch,{error,timeout}},[{cpu_sup,measurement_server_init,0,[{file,"cpu_sup.erl"},{line,502}]}]}
>
> The stray message could be left over from a timed-out gen call. If that is
> the case and late replies to calls can happen, then clearly the logger
> process should handle them more gracefully.
>
>
> --
>
> Andreas Schultz
>