[erlang-questions] ssl_esock leaking file descriptors

Justin Milam jsmilam@REDACTED
Wed Aug 31 16:13:30 CEST 2011


I've started to notice a slow file descriptor leak in the ssl_esock port
program. I'm running Erlang R14B and using SSL to encrypt traffic over the
Erlang distribution protocol. The cluster has at least 10 nodes, with
transient nodes joining and leaving regularly. Checking the ssl_esock
process with lsof shows that the number of open file descriptors increases
after a node joins the cluster and then leaves. Eventually ssl_esock holds
enough open file descriptors to hit the ulimit (currently 8192), at which
point it goes into an infinite loop using nearly 100% of one CPU.
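
For anyone who wants to track the growth over time rather than eyeball lsof
output, here is a minimal sketch that counts the descriptors a process holds.
It assumes Linux (it reads /proc/<pid>/fd); the function names and the
sampling interval are illustrative, not part of any Erlang tooling.

```python
import os
import sys
import time


def count_open_fds(pid):
    """Return the number of file descriptors open in process `pid`.

    Linux only: each entry in /proc/<pid>/fd is one open descriptor.
    Reading another user's process requires sufficient privileges.
    """
    return len(os.listdir(f"/proc/{pid}/fd"))


def watch(pid, interval=5.0, rounds=12):
    """Print a timestamped descriptor count every `interval` seconds."""
    for _ in range(rounds):
        print(time.strftime("%H:%M:%S"), count_open_fds(pid))
        time.sleep(interval)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python fdwatch.py <pid of ssl_esock>
    watch(int(sys.argv[1]))
```

A count that steps up after each node join/leave cycle and never comes back
down would match the leak described above.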

I've been able to reproduce the issue by lowering the ulimit and repeatedly
connecting and disconnecting a remote shell to a locally running node until
the ulimit is reached. With ssl_esock running in debug mode, I see the
following logged continually:

==========LOOP=============
MASKS SET FOR FD: 27 (read) 26 (read) 25 (read) 24 (read) 19 (read) 18
(read) 17 (read) 16 (read) 12 (read) 11 (read) 10 (read) 9 (read) 8 (read) 7
(read) 6 (read)
CONNECTIONS:
 - DEFUNCT [0x8772978] (fd = 29)
 - DEFUNCT [0x86f9950] (fd = 28)
 - JOINED [0x875ae30] (origin = accept)
       (fd = 26, eof = 0, wq = 0, bp = 0)
       (proxyfd = 27, eof = 0, wq = 0, bp = 0)
 - JOINED [0x86fa970] (origin = accept)
       (fd = 24, eof = 0, wq = 0, bp = 0)
       (proxyfd = 25, eof = 0, wq = 0, bp = 0)
 - DEFUNCT [0x8733600] (fd = 21)
 - DEFUNCT [0x8732c38] (fd = 20)
 - JOINED [0x8733958] (origin = accept)
       (fd = 18, eof = 0, wq = 0, bp = 0)
       (proxyfd = 19, eof = 0, wq = 0, bp = 0)
 - JOINED [0x8734f78] (origin = accept)
       (fd = 16, eof = 0, wq = 0, bp = 0)
       (proxyfd = 17, eof = 0, wq = 0, bp = 0)
 - CONNECTED [0x87134a8] (fd = 15)
 - DEFUNCT [0x871f220] (fd = 13)
 - JOINED [0x87147d0] (origin = accept)
       (fd = 11, eof = 0, wq = 0, bp = 0)
       (proxyfd = 12, eof = 0, wq = 0, bp = 0)
 - JOINED [0x87083d0] (origin = connect)
       (fd = 9, eof = 0, wq = 0, bp = 0)
       (proxyfd = 10, eof = 0, wq = 0, bp = 0)
 - JOINED [0x86f29e8] (origin = connect)
       (fd = 7, eof = 0, wq = 0, bp = 0)
       (proxyfd = 8, eof = 0, wq = 0, bp = 0)
 - ACTIVE_LISTENING [0x86f2258] (fd = 6, acceptors = 1)
Before poll/select: 15 descriptors (total 29)
Error calling accept()
accept error (proxy_listensock): emfile
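
One plausible reading of the spin, offered as an assumption and not confirmed
against the ssl_esock source: once the descriptor limit is reached, accept()
fails with EMFILE but the pending connection stays queued, so the listening
socket remains readable and a poll/select loop wakes up again immediately.
The sketch below reproduces that general mechanism on a plain TCP socket. It
is not ssl_esock code; the function name and the limit of 64 are made up for
the demonstration, and it needs a Unix system (it uses the resource module).

```python
import errno
import os
import resource
import select
import socket


def emfile_spin(rounds=3, limit=64):
    """Count consecutive select()/accept() rounds that fail with EMFILE."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        new_soft = limit
    else:
        new_soft = min(limit, soft)

    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(8)
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # The connection completes in the listen backlog without accept().
    client.connect(listener.getsockname())

    hogs = []
    failures = 0
    try:
        # Lower the soft limit, then use up every remaining descriptor.
        resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
        try:
            while True:
                hogs.append(os.dup(listener.fileno()))
        except OSError as exc:
            if exc.errno != errno.EMFILE:
                raise

        for _ in range(rounds):
            readable, _, _ = select.select([listener], [], [], 1.0)
            if not readable:
                break  # nothing pending; the loop would block here
            try:
                conn, _ = listener.accept()
            except OSError as exc:
                if exc.errno != errno.EMFILE:
                    raise
                failures += 1  # connection is still queued; select fires again
            else:
                conn.close()
                break
    finally:
        for fd in hogs:
            os.close(fd)
        resource.setrlimit(resource.RLIMIT_NOFILE, (soft, hard))
        client.close()
        listener.close()
    return failures


if __name__ == "__main__":
    print("EMFILE rounds:", emfile_spin())
```

If every round fails without select() ever blocking, a loop that simply
retries will consume a full CPU, which would be consistent with the repeated
"accept error ... emfile" lines above.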

Has anyone else experienced such behavior?

Thanks

-justin