strange crashes of logger and supervisor

Lukas Larsson lukas@REDACTED
Tue Dec 1 08:55:59 CET 2020


Merged a fix to maint for this problem:
https://github.com/erlang/otp/pull/2884

On Fri, Oct 9, 2020 at 1:43 PM Lukas Larsson <lukas@REDACTED> wrote:

> Hello,
>
> This seems to be the logger proxy, i.e. the process that logs messages
> sent by the emulator and other nodes.
>
> It should not crash when receiving these late replies to
> gen_server:call/2. I will prepare a patch for it.
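>
> To illustrate the general shape of such a fix (a hedged sketch only, not
> the actual patch), a gen_server can tolerate late {Ref, Reply} replies
> from timed-out calls with a catch-all handle_info clause, for example:
>
> %% Hypothetical module name, for illustration only.
> -module(late_reply_tolerant).
> -behaviour(gen_server).
> -export([start_link/0, init/1, handle_call/3, handle_cast/2, handle_info/2]).
>
> start_link() ->
>     gen_server:start_link(?MODULE, [], []).
>
> init([]) ->
>     {ok, #{}}.
>
> handle_call(_Request, _From, State) ->
>     {reply, ok, State}.
>
> handle_cast(_Msg, State) ->
>     {noreply, State}.
>
> %% A reply to a gen_server:call/2,3 that already timed out arrives as a
> %% plain {Ref, Reply} message; silently dropping it avoids the
> %% function_clause crash shown in the report quoted below.
> handle_info({Ref, _Reply}, State) when is_reference(Ref) ->
>     {noreply, State};
> handle_info(_Other, State) ->
>     {noreply, State}.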
>
> In OTP-24 there is a proposal for process aliases (
> https://github.com/erlang/otp/pull/2735) that should get rid of these
> types of bugs.
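>
> As a rough sketch of that idea (assuming the erlang:monitor/3 alias
> options from that proposal, i.e. OTP 24+; the {call, Mref, Request} wire
> format below is made up for illustration): the monitor reference doubles
> as an alias, and demonitoring on timeout also deactivates the alias, so
> a late reply is dropped by the runtime instead of landing in the mailbox:
>
> -module(alias_call_sketch).
> -export([call/3]).
>
> %% Hedged sketch, not the real gen.erl implementation. Requires OTP 24+.
> call(Server, Request, Timeout) ->
>     %% The monitor reference is also an alias; it stops accepting
>     %% messages once it is demonitored.
>     Mref = erlang:monitor(process, Server, [{alias, demonitor}]),
>     %% Hypothetical protocol: the server is assumed to reply with
>     %% Mref ! {Mref, Reply}.
>     Server ! {call, Mref, Request},
>     receive
>         {Mref, Reply} ->
>             erlang:demonitor(Mref, [flush]),
>             {ok, Reply};
>         {'DOWN', Mref, process, _Pid, Reason} ->
>             {error, Reason}
>     after Timeout ->
>         %% Demonitoring deactivates the alias, so a reply sent after
>         %% this point is dropped instead of becoming a stray message.
>         erlang:demonitor(Mref, [flush]),
>         {error, timeout}
>     end.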
>
> Lukas
>
> On Fri, Oct 9, 2020 at 11:47 AM Andreas Schultz <
> andreas.schultz@REDACTED> wrote:
>
>> Hi,
>>
>> I've got some really strange crashes of the logger and errors from the
>> supervisor on OTP 22.1.7:
>>
>> 2020-10-08T20:29:11.310911+00:00 error: Generic server <0.934.0>
>> terminating. Reason:
>> {function_clause,[{logger_proxy,handle_info,[{#Ref<0.3534272603.1552154625.127680>,ok},no_state],[{file,"logger_proxy.erl"},{line,136}]},{logger_olp,try_callback_call,4,[{file,"logger_olp.erl"},{line,598}]},{logger_olp,handle_info,2,[{file,"logger_olp.erl"},{line,270}]},{gen_server,try_dispatch,4,[{file,"gen_server.erl"},{line,680}]},{gen_server,handle_msg,6,[{file,"gen_server.erl"},{line,756}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,226}]}]}.
>> Last message: {#Ref<0.3534272603.1552154625.127680>,ok}. State:
>> #{burst_limit_enable => false,burst_limit_max_count =>
>> 500,burst_limit_window_time => 1000,burst_msg_count => 0,burst_win_ts =>
>> -576460751230297,cb_state => no_state,drop_mode_qlen => 1000,flush_qlen =>
>> 5000,id => logger_proxy,idle => false,last_load_ts =>
>> -575306488755097,last_qlen => 1,mode => async,mode_ref =>
>> {logger_olp,logger_proxy},module => logger_proxy,overload_kill_enable =>
>> false,overload_kill_mem_size => 3000000,overload_kill_qlen =>
>> 20000,overload_kill_restart_after => 5000,sync_mode_qlen => 500}.
>> 2020-10-08T20:29:12.179509+00:00 error: Supervisor received unexpected
>> message: {#Ref<0.3534272603.1029701639.174367>,ok}
>> 2020-10-08T20:29:13.992900+00:00 error: Supervisor received unexpected
>> message: {#Ref<0.3534272603.1029701639.174370>,ok}
>> 2020-10-08T20:29:12.797713+00:00 error: crasher: initial call:
>> logger_olp:init/1, pid: <0.934.0>, registered_name: [], error:
>> {function_clause,[{logger_proxy,handle_info,[{#Ref<0.3534272603.1552154625.127680>,ok},no_state],[{file,"logger_proxy.erl"},{line,136}]},{logger_olp,try_callback_call,4,[{file,"logger_olp.erl"},{line,598}]},{logger_olp,handle_info,2,[{file,"logger_olp.erl"},{line,270}]},{gen_server,try_dispatch,4,[{file,"gen_server.erl"},{line,680}]},{gen_server,handle_msg,6,[{file,"gen_server.erl"},{line,756}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,226}]}]},
>> ancestors: [logger_sup,kernel_sup,<0.902.0>], message_queue_len: 1,
>> messages: [{#Ref<0.3534272603.1552154625.128792>,ok}], links: [<0.932.0>],
>> dictionary: [{olp_ref,{logger_proxy,<0.934.0>,{logger_olp,logger_proxy}}}],
>> trap_exit: true, status: running, heap_size: 987, stack_size: 28,
>> reductions: 84949; neighbours:
>> 2020-10-08T20:29:15.492989+00:00 error: Supervisor received unexpected
>> message: {#Ref<0.3534272603.1322254338.95824>,ok}
>> 2020-10-08T20:29:21.399535+00:00 error: Supervisor received unexpected
>> message: {#Ref<0.3534272603.1322254338.95915>,ok}
>> 2020-10-08T20:29:22.982660+00:00 error: Supervisor: {local,logger_sup}.
>> Context: child_terminated. Reason:
>> {function_clause,[{logger_proxy,handle_info,[{#Ref<0.3534272603.1552154625.127680>,ok},no_state],[{file,"logger_proxy.erl"},{line,136}]},{logger_olp,try_callback_call,4,[{file,"logger_olp.erl"},{line,598}]},{logger_olp,handle_info,2,[{file,"logger_olp.erl"},{line,270}]},{gen_server,try_dispatch,4,[{file,"gen_server.erl"},{line,680}]},{gen_server,handle_msg,6,[{file,"gen_server.erl"},{line,756}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,226}]}]}.
>> Offende
>>
>> It seems that somehow an unhandled {Ref, ok} message got into the inbox
>> of these servers. I'm pretty sure my code is not sending such messages to
>> random processes. Does anyone have an idea how this could happen?
>>
>> One clue might be that the system was under heavy load, possibly
>> overloaded, as this log entry suggests:
>>
>> 2020-10-08T20:28:15.676829+00:00 error: Error in process <0.4111.816> on
>> node 'gtphub-c-node@REDACTED' with exit value:,
>> {{badmatch,{error,timeout}},[{cpu_sup,measurement_server_init,0,[{file,"cpu_sup.erl"},{line,502}]}]}
>>
>> The stray message could be a leftover reply from a timed-out gen call. If
>> that is the case, and late replies from calls can indeed happen, then
>> clearly the logger process should handle them more gracefully.
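>>
>> For what it's worth, a small self-contained sketch of how such a
>> leftover can occur on releases without the alias mechanism (the module
>> name and timings are made up for illustration): the call gives up after
>> 100 ms, the caller survives the exit, and the reply later shows up as a
>> plain {Ref, ok} message in its mailbox:
>>
>> -module(late_reply_repro).
>> -behaviour(gen_server).
>> -export([run/0, init/1, handle_call/3, handle_cast/2]).
>>
>> %% The call gives up after 100 ms, the server replies after 200 ms,
>> %% and the reply then sits in the caller's mailbox as {Ref, ok}.
>> run() ->
>>     {ok, Pid} = gen_server:start(?MODULE, [], []),
>>     Result = (catch gen_server:call(Pid, slow, 100)),
>>     timer:sleep(300),
>>     {Result, erlang:process_info(self(), messages)}.
>>
>> init([]) -> {ok, #{}}.
>>
>> handle_call(slow, _From, State) ->
>>     timer:sleep(200),   %% reply is sent after the caller has given up
>>     {reply, ok, State}.
>>
>> handle_cast(_Msg, State) -> {noreply, State}.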
>>
>>
>> --
>>
>> Andreas Schultz
>>
>