[erlang-questions] ssl_esock leaking file descriptors

Ingela Andin ingela@REDACTED
Fri Sep 2 21:09:42 CEST 2011


Hi!

In R15 you will be able to run the Erlang distribution over the new
ssl implementation.  The plan is also to drop the old ssl
implementation in R15.

Regards Ingela Erlang/OTP-team,  Ericsson AB


2011/8/31 Gordon Guthrie <gordon@REDACTED>:
> We get an intermittent ssl_esock problem which I have never successfully
> reproduced. It goes to 100% CPU and the process needs to be manually killed.
> Richard Andrews also reported a problem with it going to 100% CPU in 2009:
> http://erlang.2086793.n4.nabble.com/ssl-esock-spinning-out-of-control-in-poll-td2117067.html
> He has a patch for that.
> It is on my 'long list' of things to fix but more frequent/reproducible ones
> always get in the way.
> Gordon
>
> On 31 August 2011 15:13, Justin Milam <jsmilam@REDACTED> wrote:
>>
>> I've started to notice a slow leak of file descriptors in the ssl_esock
>> port. I'm running Erlang R14B and using SSL to encrypt traffic over the
>> Erlang distribution protocol. The cluster has 10 nodes minimum with
>> transient nodes joining and leaving the cluster regularly. From checking the
>> ssl_esock process with lsof it appears to be slowly leaking file
>> descriptors. The number of open file descriptors seems to increase after a
>> node joins the cluster and then leaves. Eventually ssl_esock holds open
>> enough file descriptors to hit the ulimit (currently 8192), at which point
>> ssl_esock goes into an infinite loop using nearly 100% of one of the CPUs.
>> I've been able to reproduce the issue by lowering the ulimit and
>> continually connecting/disconnecting a remote shell to a local running node
>> until the ulimit is reached. When ssl_esock is running in debug mode I see
>> the following being logged continually:
>> ==========LOOP=============
>> MASKS SET FOR FD: 27 (read) 26 (read) 25 (read) 24 (read) 19 (read) 18
>> (read) 17 (read) 16 (read) 12 (read) 11 (read) 10 (read) 9 (read) 8 (read) 7
>> (read) 6 (read)
>> CONNECTIONS:
>>  - DEFUNCT [0x8772978] (fd = 29)
>>  - DEFUNCT [0x86f9950] (fd = 28)
>>  - JOINED [0x875ae30] (origin = accept)
>>        (fd = 26, eof = 0, wq = 0, bp = 0)
>>        (proxyfd = 27, eof = 0, wq = 0, bp = 0)
>>  - JOINED [0x86fa970] (origin = accept)
>>        (fd = 24, eof = 0, wq = 0, bp = 0)
>>        (proxyfd = 25, eof = 0, wq = 0, bp = 0)
>>  - DEFUNCT [0x8733600] (fd = 21)
>>  - DEFUNCT [0x8732c38] (fd = 20)
>>  - JOINED [0x8733958] (origin = accept)
>>        (fd = 18, eof = 0, wq = 0, bp = 0)
>>        (proxyfd = 19, eof = 0, wq = 0, bp = 0)
>>  - JOINED [0x8734f78] (origin = accept)
>>        (fd = 16, eof = 0, wq = 0, bp = 0)
>>        (proxyfd = 17, eof = 0, wq = 0, bp = 0)
>>  - CONNECTED [0x87134a8] (fd = 15)
>>  - DEFUNCT [0x871f220] (fd = 13)
>>  - JOINED [0x87147d0] (origin = accept)
>>        (fd = 11, eof = 0, wq = 0, bp = 0)
>>        (proxyfd = 12, eof = 0, wq = 0, bp = 0)
>>  - JOINED [0x87083d0] (origin = connect)
>>        (fd = 9, eof = 0, wq = 0, bp = 0)
>>        (proxyfd = 10, eof = 0, wq = 0, bp = 0)
>>  - JOINED [0x86f29e8] (origin = connect)
>>        (fd = 7, eof = 0, wq = 0, bp = 0)
>>        (proxyfd = 8, eof = 0, wq = 0, bp = 0)
>>  - ACTIVE_LISTENING [0x86f2258] (fd = 6, acceptors = 1)
>> Before poll/select: 15 descriptors (total 29)
>> Error calling accept()
>> accept error (proxy_listensock): emfile
>> Has anyone else experienced such behavior?
>> Thanks
>> -justin
>> _______________________________________________
>> erlang-questions mailing list
>> erlang-questions@REDACTED
>> http://erlang.org/mailman/listinfo/erlang-questions
>>
>
>
>
> --
> Gordon Guthrie
> CEO hypernumbers
>
> http://hypernumbers.com
> t: hypernumbers
> +44 7776 251669
>
>
>
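For anyone who wants to watch for the slow descriptor leak described above without running lsof by hand, here is a minimal sketch in Python. It assumes a Linux host with /proc mounted; the helper names count_open_fds and fd_growth are made up for illustration and are not part of OTP or ssl_esock.

```python
import os


def count_open_fds(pid):
    """Return how many file descriptors `pid` currently holds.

    Assumption: Linux, where /proc/<pid>/fd has one entry per open descriptor.
    """
    return len(os.listdir(f"/proc/{pid}/fd"))


def fd_growth(samples):
    """Return the change in descriptor count between the first and last sample."""
    if not samples:
        return 0
    return samples[-1] - samples[0]


if __name__ == "__main__":
    # Demonstrated on this process itself; for ssl_esock you would pass the
    # pid of the port program instead (for example as found with ps).
    me = os.getpid()
    samples = [count_open_fds(me)]
    handle = open(os.devnull)          # simulate one leaked descriptor
    samples.append(count_open_fds(me))
    print("descriptor growth:", fd_growth(samples))
    handle.close()
```

Sampling the count before and after a node joins and leaves the cluster should show whether the number keeps climbing, which is the pattern Justin reports.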

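The repeating "accept error (proxy_listensock): emfile" lines in the debug output are consistent with a well-known pattern: once a process is out of descriptors, accept() fails with EMFILE, the pending connection stays in the listen queue, and poll/select reports the listening socket as readable again straight away. Whether this is exactly what happens inside ssl_esock is an assumption, not something the thread confirms. The sketch below reproduces only the general pattern in Python on a Unix-like system; the function name and the limit of 64 descriptors are invented for the demonstration.

```python
import errno
import os
import resource
import select
import socket


def accept_spins_on_emfile(rounds=3, limit=64):
    """Count how many consecutive select/accept rounds fail with EMFILE.

    A select loop that does not handle EMFILE would repeat this forever,
    because the unaccepted connection keeps the listener readable.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    new_soft = limit if soft == resource.RLIM_INFINITY else min(limit, soft)

    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(8)
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(listener.getsockname())   # now waiting in the listen queue

    hoard = [os.open(os.devnull, os.O_RDONLY)]
    failures = 0
    try:
        resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
        # Use up every remaining descriptor slot, like a slow leak would.
        try:
            while True:
                hoard.append(os.dup(hoard[0]))
        except OSError as exc:
            if exc.errno != errno.EMFILE:
                raise

        for _round in range(rounds):
            readable, _w, _x = select.select([listener], [], [], 1.0)
            if not readable:
                break
            try:
                conn, _addr = listener.accept()
            except OSError as exc:
                if exc.errno != errno.EMFILE:
                    raise
                failures += 1                 # listener is still readable
            else:
                conn.close()
                break
    finally:
        # Release the hoarded descriptors and restore the original limit.
        for fd in hoard:
            os.close(fd)
        resource.setrlimit(resource.RLIMIT_NOFILE, (soft, hard))
        client.close()
        listener.close()
    return failures


if __name__ == "__main__":
    print("EMFILE failures in a row:", accept_spins_on_emfile())
```

A loop that wants to avoid spinning in this situation typically has to stop polling the listening socket for a while, or free a descriptor, before calling accept() again.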

