The Mystery of the Vanishing Message's Dead Lock

Fred Hebert mononcqc@REDACTED
Thu Jul 23 22:52:35 CEST 2020


The release notes for ERTS-10.7.1 mention:

> Fixed a bug in a receive optimization. This could cause a receive not to
> match even though a matching message was present in the message queue. This
> bug was introduced in ERTS version 10.6 (OTP 22.2).
> Own Id: OTP-16572 Aux Id: ERL-1199, OTP-16269


We were on ERTS-10.6.2 (OTP-22.1.??). Today we reproduced it again and I
knew just what to look for so:

(rabbit@REDACTED)16> lists:filter(fun({_Pid, _Ref, {sync_notify, _}}) -> false; (_) -> true end,
(rabbit@REDACTED)16>     element(2, recon:info(rabbit_log_connection_lager_event, messages))).
[{#Ref<0.229770800.3298820110.79067>,ok}]

And there we have it. Out of 85,000 messages or so in that worker, only one
of them wasn't a blocked sync_notify, and it was a message of the form {Ref,
ok} which entirely matches the compiler bug. I can't say for sure it's the
right response, but assuming things are synchronous there's no reason we'd
have another one either. We're upgrading to Erlang/OTP-22.3.1 or newer
(which contains ERTS-10.7.1). I'm very surprised that the compiler bug had
an effect on gen:call() rather than just the branching pattern in socket
(but somewhat relieved because it would otherwise imply another bug). Out
of caution, we'll have engineers try to reproduce it once more after the
upgrade to confirm that it fixes the problem.
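
For readers less familiar with the pattern, a synchronous call of this kind
boils down to roughly the shape below. This is a simplified sketch and not the
actual gen.erl source; the module name call_sketch and the {ok, Reply} return
shape are assumptions made for illustration. The point is only that the caller
sits in a receive waiting for {Ref, Reply}, which is exactly the shape of the
one leftover message found in the mailbox above.

```erlang
-module(call_sketch).
-export([call/3]).

%% Hypothetical, simplified synchronous call (NOT the real gen:call code).
%% Monitors the target, sends a tagged request, then waits for a reply
%% tagged with the same reference.
call(Pid, Request, Timeout) ->
    Ref = erlang:monitor(process, Pid),
    Pid ! {'$gen_call', {self(), Ref}, Request},
    receive
        {Ref, Reply} ->
            %% The reply carries our reference; drop the monitor and any
            %% 'DOWN' message that may already be queued for it.
            erlang:demonitor(Ref, [flush]),
            {ok, Reply};
        {'DOWN', Ref, process, Pid, Reason} ->
            %% The target died before replying.
            exit(Reason)
    after Timeout ->
        erlang:demonitor(Ref, [flush]),
        exit(timeout)
    end.
```

If a receive like this fails to match a {Ref, Reply} message that is already
in the queue, the caller stays blocked while the reply sits unread, which is
consistent with what the mailbox dump shows.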

On Thu, Jul 23, 2020 at 3:53 PM Michael Truog <mjtruog@REDACTED> wrote:

> On 7/23/20 9:18 AM, Fred Hebert wrote:
>
>
> The two theories I have for now are:
>
>    1. the reply was received but not processed (the bug at
>    https://bugs.erlang.org/browse/ERL-1076 could be at play but the bug
>    report's optimization requires a conditional that doesn't match the format
>    of gen:call() here)
>    2. the reply was "sent" but never actually enqueued in the destination
>    process
>
>
> Hi Fred,
>
> ERL-1076 was fixed in 21.3.8.11 [1] and 22.1.6 [2] (with 23.x getting
> released afterwards).  Are you using an older Erlang/OTP version?
>
> Best Regards,
> Michael
>
> [1]
> https://erlang.org/pipermail/erlang-questions/2019-November/098785.html
> [2]
> https://erlang.org/pipermail/erlang-questions/2019-November/098716.html
>
>