[erlang-questions] Abandoned (ranch) connection processes?

Roger Lipscombe roger@REDACTED
Thu Feb 15 16:54:18 CET 2018


OK. I did some spelunking, and the abandoned processes seem to be
strongly correlated with the appearance of the following in my logs:

SSL: {connection,{alert,2,20,{"tls_record.erl",488},undefined}}:
ssl_connection.erl:861:Fatal error: unexpected message

When I take an "abandoned" ranch protocol process, find its associated
SSL pid, and then search for that pid in my log files (with log_alerts
enabled), I find one of the above messages for every one I've looked at
so far.
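
For reference, here is a minimal sketch of a protocol loop that checks the
result of each socket operation, as suggested in the quoted thread below.
It is not my production code. The module name checked_protocol and the
IDLE_TIMEOUT value are hypothetical, and it assumes the ranch 1.x API, where
Transport:messages() returns a 3-tuple and Transport:setopts/2 returns
ok or {error, Reason}.

```erlang
-module(checked_protocol).
-export([start_link/4, init/4]).

%% Hypothetical idle timeout in milliseconds; not from the original thread.
-define(IDLE_TIMEOUT, 60000).

%% ranch 1.x protocol callback: spawn one process per connection.
start_link(Ref, Socket, Transport, Opts) ->
    Pid = spawn_link(?MODULE, init, [Ref, Socket, Transport, Opts]),
    {ok, Pid}.

init(Ref, Socket, Transport, _Opts) ->
    ok = ranch:accept_ack(Ref),
    loop(Socket, Transport).

loop(Socket, Transport) ->
    %% Assumes ranch 1.x, where messages/0 returns {OK, Closed, Error}.
    {OK, Closed, Error} = Transport:messages(),
    %% Check the result instead of assuming the socket is still open.
    case Transport:setopts(Socket, [{active, once}]) of
        ok ->
            receive
                {OK, Socket, _Data} ->
                    %% Real code would handle the data here.
                    loop(Socket, Transport);
                {Closed, Socket} ->
                    ok;
                {Error, Socket, _Reason} ->
                    Transport:close(Socket)
            after ?IDLE_TIMEOUT ->
                %% Safety net: if no message ever arrives, the process
                %% still exits rather than lingering forever.
                Transport:close(Socket)
            end;
        {error, _Reason} ->
            %% The socket was already closed (or otherwise unusable)
            %% before the first {active, once}; exit cleanly.
            Transport:close(Socket)
    end.
```

The receive timeout is only a backstop. It does not explain why the close
notification never arrives, but it would keep a process from being stranded
if that happens.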

On 15 February 2018 at 11:55, Roger Lipscombe <roger@REDACTED> wrote:
> Related: this particular set of servers was suffering previously from
> the bug fixed in
> https://github.com/erlang/otp/commit/a5434a323afb1972195aa5f55b4894595df2c24f.
>
> We have a *lot* of battery-powered, Wi-Fi-connected devices. It only
> takes a small fraction of a percent of them having a bad day to
> exercise all kinds of error conditions.
>
> Is it possible that there's another path through here that's not
> cleaning up properly?
>
> On 15 February 2018 at 11:23, Roger Lipscombe <roger@REDACTED> wrote:
>> OK. Let me rephrase that:
>>
>> - {active, once} obviously has something in place to handle data
>> arriving and closed sockets *in between* calls to {active, once} --
>> i.e. it'll be {active, false} for a brief interval. I last looked at
>> this code in 17.x (before the gen_statem refactoring), so I'm not sure
>> where it lives now.
>> - does it deal correctly with closed sockets that close before the
>> *first* call to {active, once}? In other words: can I expect an
>> ssl_closed message in this case? Is there something special about the
>> first call?
>>
>> On 15 February 2018 at 11:15, Dmitry Kolesnikov <dmkolesnikov@REDACTED> wrote:
>>> Hello,
>>>
>>> On 15 Feb 2018, at 13.08, Roger Lipscombe <roger@REDACTED> wrote:
>>>
>>> The only thing I can think of is that the socket is being closed
>>> between ranch:accept_ack and Transport:setopts, and Erlang's not
>>> sending the ssl_closed message. Does this sound likely? How do I deal
>>> with this?
>>>
>>>
>>> No, it does not sound likely! The bug is either in ranch or in your code.
>>>
>>> I think you should verify the result of each socket operation before
>>> proceeding. I am referring here to your statement: I'm *not* verifying
>>> the result.
>>>
>>>
>>> Best Regards,
>>> Dmitry


