<div dir="ltr"><div>Hi Folks,<br>I've got a system using Erlang/OTP 18.3.4.1 and RabbitMQ 3.6.3. Everything is local to the system and there is no clustering.<br><br>We are seeing intermittent failures when stopping, uninstalling, reinstalling, and starting epmd.<br><br>When this happens we also see many sockets stuck in CLOSE_WAIT, like so:<br>tcp      48     0 <a href="http://0.0.0.0:4369">0.0.0.0:4369</a>           0.0.0.0:*              LISTEN     0         570937    1/systemd<br>tcp       5     0 <a href="http://127.0.0.1:4369">127.0.0.1:4369</a>         <a href="http://127.0.0.1:37560">127.0.0.1:37560</a>        CLOSE_WAIT 0         0         -<br>tcp      31     0 <a href="http://127.0.0.1:4369">127.0.0.1:4369</a>         <a href="http://127.0.0.1:42564">127.0.0.1:42564</a>        CLOSE_WAIT 0         0         -<br>tcp      31     0 <a href="http://127.0.0.1:4369">127.0.0.1:4369</a>         <a href="http://127.0.0.1:53126">127.0.0.1:53126</a>        CLOSE_WAIT 0         0         -<br>tcp      31     0 <a href="http://127.0.0.1:4369">127.0.0.1:4369</a>         <a href="http://127.0.0.1:40222">127.0.0.1:40222</a>        CLOSE_WAIT 0         0         -<br>tcp      38     0 <a href="http://127.0.0.1:4369">127.0.0.1:4369</a>         <a href="http://127.0.0.1:33506">127.0.0.1:33506</a>        CLOSE_WAIT 0         0         -<br>tcp      31     0 <a href="http://127.0.0.1:4369">127.0.0.1:4369</a>         <a href="http://127.0.0.1:56332">127.0.0.1:56332</a>        CLOSE_WAIT 0         0         -<br>tcp      31     0 <a href="http://127.0.0.1:4369">127.0.0.1:4369</a>         <a href="http://127.0.0.1:50511">127.0.0.1:50511</a>        CLOSE_WAIT 0         0         -<br>tcp      31     0 <a href="http://127.0.0.1:4369">127.0.0.1:4369</a>         <a href="http://127.0.0.1:45528">127.0.0.1:45528</a>        CLOSE_WAIT 0         0         -<br>tcp      31     0 <a href="http://127.0.0.1:4369">127.0.0.1:4369</a>         <a href="http://127.0.0.1:59487">127.0.0.1:59487</a>     
   CLOSE_WAIT 0         0         -<br>tcp       4     0 <a href="http://127.0.0.1:4369">127.0.0.1:4369</a>         <a href="http://127.0.0.1:37506">127.0.0.1:37506</a>        CLOSE_WAIT 0         0         -<br>tcp      31     0 <a href="http://127.0.0.1:4369">127.0.0.1:4369</a>         <a href="http://127.0.0.1:41554">127.0.0.1:41554</a>        CLOSE_WAIT 0         0         -<br>tcp      31     0 <a href="http://127.0.0.1:4369">127.0.0.1:4369</a>         <a href="http://127.0.0.1:40080">127.0.0.1:40080</a>        CLOSE_WAIT 0         0         -<br>tcp      31     0 <a href="http://127.0.0.1:4369">127.0.0.1:4369</a>         <a href="http://127.0.0.1:32903">127.0.0.1:32903</a>        CLOSE_WAIT 0         0         -<br>tcp      31     0 <a href="http://127.0.0.1:4369">127.0.0.1:4369</a>         <a href="http://127.0.0.1:48851">127.0.0.1:48851</a>        CLOSE_WAIT 0         0         -<br>tcp      31     0 <a href="http://127.0.0.1:4369">127.0.0.1:4369</a>         <a href="http://127.0.0.1:35177">127.0.0.1:35177</a>        CLOSE_WAIT 0         0         -<br>tcp      31     0 <a href="http://127.0.0.1:4369">127.0.0.1:4369</a>         <a href="http://127.0.0.1:44931">127.0.0.1:44931</a>        CLOSE_WAIT 0         0         -<br>tcp      31     0 <a href="http://127.0.0.1:4369">127.0.0.1:4369</a>         <a href="http://127.0.0.1:54730">127.0.0.1:54730</a>        CLOSE_WAIT 0         0         -<br>tcp      31     0 <a href="http://127.0.0.1:4369">127.0.0.1:4369</a>         <a href="http://127.0.0.1:48311">127.0.0.1:48311</a>        CLOSE_WAIT 0         0         -<br>tcp      31     0 <a href="http://127.0.0.1:4369">127.0.0.1:4369</a>         <a href="http://127.0.0.1:39159">127.0.0.1:39159</a>        CLOSE_WAIT 0         0         -<br>tcp      31     0 <a href="http://127.0.0.1:4369">127.0.0.1:4369</a>         <a href="http://127.0.0.1:47166">127.0.0.1:47166</a>        CLOSE_WAIT 0         0         -<br>tcp       2     0 <a 
href="http://127.0.0.1:4369">127.0.0.1:4369</a>         <a href="http://127.0.0.1:37541">127.0.0.1:37541</a>        CLOSE_WAIT 0         0         -<br>tcp      31     0 <a href="http://127.0.0.1:4369">127.0.0.1:4369</a>         <a href="http://127.0.0.1:38290">127.0.0.1:38290</a>        CLOSE_WAIT 0         0         -<br>tcp      31     0 <a href="http://127.0.0.1:4369">127.0.0.1:4369</a>         <a href="http://127.0.0.1:43044">127.0.0.1:43044</a>        CLOSE_WAIT 0         0         -<br>tcp       2     0 <a href="http://127.0.0.1:4369">127.0.0.1:4369</a>         <a href="http://127.0.0.1:37540">127.0.0.1:37540</a>        CLOSE_WAIT 0         0         -<br>tcp       2     0 <a href="http://127.0.0.1:4369">127.0.0.1:4369</a>         <a href="http://127.0.0.1:37544">127.0.0.1:37544</a>        CLOSE_WAIT 0         0         -<br><br><br><br>On an identical working system the output looks like this:<br>tcp       0     0 <a href="http://0.0.0.0:4369">0.0.0.0:4369</a>           0.0.0.0:*              LISTEN     1/systemd<br>tcp       0     0 &lt;ip address&gt;:4369       <a href="http://9.47.80.245:36368">9.47.80.245:36368</a>      TIME_WAIT  -<br>tcp       0     0 <a href="http://127.0.0.1:34836">127.0.0.1:34836</a>        <a href="http://127.0.0.1:4369">127.0.0.1:4369</a>         ESTABLISHED 22713/beam.smp<br>tcp       0     0 <a href="http://127.0.0.1:4369">127.0.0.1:4369</a>         <a href="http://127.0.0.1:34836">127.0.0.1:34836</a>        ESTABLISHED 21186/epmd<br><br>On the hung system:<br>epmd -names and epmd -kill both hang indefinitely.<br>Attempting to restart epmd.socket or epmd.service gives the error:<br>epmd.socket failed to listen on sockets: Address already in use<br><br><br><br>Is there any way to<br>a) Get more information about what is causing this state (so I can hopefully prevent it in the future)<br>or<br>b) Recover from this state (without rebooting the system)?<br><br></div>Thanks!<br></div>