[erlang-bugs] 100% CPU usage on Mac OS X Leopard after peer closes socket

Matthias Radestock matthias@REDACTED
Wed Apr 16 19:59:23 CEST 2008


The RabbitMQ team have received several reports from users running Mac 
OS X Leopard that the RabbitMQ beam process sometimes consumes 100% CPU 
while supposedly idle, and never recovers.

We eventually managed to reduce the problem to a test case captured by 
the attached program. It simply listens on a port, accepts a connection 
and then sends lots of data down the socket. The interesting bit is what 
happens when the peer closes the socket. To reproduce the abnormal 
behaviour just run

   sock_spin:broken(SomePort).

connect to the port with netcat, e.g.

   nc localhost SomePort > /dev/null

and then terminate the connection by ^C'ing netcat.

On Mac OS X Leopard this results in {error,einval} being returned (which 
is expected) and the beam process subsequently consuming 100% CPU 
(though the Erlang shell remains responsive).

By contrast, sock_spin:working/1 behaves fine: after reporting the 
error, CPU usage drops to 0%.
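
For readers without the attachment, the following is only a rough sketch 
of a program of the shape described above (listen, accept one connection, 
send until an error is returned). It is not the contents of sock_spin.erl: 
the module name sock_sketch, the function names, the socket options and 
the chunk size are all assumptions, and it makes no attempt to reproduce 
whatever distinguishes broken/1 from working/1.

```erlang
-module(sock_sketch).
-export([serve/1, send_loop/2]).

%% Hypothetical sketch, not the attached sock_spin.erl.
%% Listen on Port, accept a single connection, then keep writing to it
%% until gen_tcp:send/2 returns an error (e.g. after the peer closes).
serve(Port) ->
    {ok, LSock} = gen_tcp:listen(Port, [binary, {active, false},
                                        {reuseaddr, true}]),
    {ok, Sock} = gen_tcp:accept(LSock),
    %% 64 KB of zero bytes; the size is an arbitrary choice.
    Chunk = list_to_binary(lists:duplicate(65536, 0)),
    Result = send_loop(Sock, Chunk),
    io:format("send stopped: ~p~n", [Result]),
    %% This sketch closes both sockets after the error; that is a choice
    %% made here, not a statement about what the attachment does.
    gen_tcp:close(Sock),
    gen_tcp:close(LSock),
    Result.

%% Send Chunk repeatedly; return the first {error, Reason} seen.
send_loop(Sock, Chunk) ->
    case gen_tcp:send(Sock, Chunk) of
        ok -> send_loop(Sock, Chunk);
        {error, _Reason} = Error -> Error
    end.
```

It would be driven the same way as the real test case: call 
sock_sketch:serve(SomePort), connect with netcat, then ^C netcat.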


Running 'dtruss -c -s -p <process>' on the spinning beam process shows 
the following system calls being repeated over and over again:

select(0xB, 0x400244, 0x400344, 0x0, 0xBFFFE69C)                 = 1 0
               libSystem.B.dylib`select$DARWIN_EXTSN+0xc
               beam`check_fd_events+0x124
               beam`erts_poll_wait_nkp+0xa0
               beam`erts_check_io_nkp+0xb0
               beam`erl_sys_schedule+0x90
               beam`schedule+0x2ec
               beam`process_main+0x18c
               beam`erl_start+0x1150
               beam`main+0x10
               beam`start+0x44

writev(0xA, 0x48DFBC, 0x2)               = -1 Err#32
               libSystem.B.dylib`writev$UNIX2003+0xc
               beam`tcp_inet_drv_output+0x14c
               beam`erts_port_ready_output+0x6c
               beam`erts_port_task_execute+0x24c
               beam`schedule+0x330
               beam`process_main+0x18c
               beam`erl_start+0x1150
               beam`main+0x10
               beam`start+0x44

getpeername(0xA, 0xBFFFE5BC, 0xBFFFE5B8)                 = -1 Err#22
               libSystem.B.dylib`getpeername$UNIX2003+0xc
               0xbfffe590
               beam`tcp_inet_drv_output+0x1ac
               beam`erts_port_ready_output+0x6c
               beam`erts_port_task_execute+0x24c
               beam`schedule+0x330
               beam`process_main+0x18c
               beam`erl_start+0x1150
               beam`main+0x10
               beam`start+0x44
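
The Err#32 and Err#22 values in the trace are errno numbers; on Mac OS X 
these are EPIPE and EINVAL respectively. So each iteration appears to be: 
select reports a ready descriptor, writev on fd 0xA fails with EPIPE, and 
getpeername on the same fd fails with EINVAL. As a small aid for reading 
such traces, the sketch below maps just those two numbers to the atoms 
Erlang uses and asks inet:format_error/1 for a description. The module 
name errno_sketch and the hard-coded table are assumptions made for 
illustration only.

```erlang
-module(errno_sketch).
-export([describe/1]).

%% Hypothetical helper: covers only the two errno values that appear in
%% the dtruss output above (numbering as on Mac OS X).
errno_atom(32) -> epipe;
errno_atom(22) -> einval.

%% Return {PosixAtom, Description} for one of the known errno values.
describe(Errno) ->
    Atom = errno_atom(Errno),
    {Atom, inet:format_error(Atom)}.
```

For example, errno_sketch:describe(32) returns a tuple whose first 
element is the atom epipe.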


We can reliably reproduce the spinning behaviour with R11B-5, R12B-1 and 
R12B-2 on Mac OS X Leopard, on both ppc and x86. We cannot reproduce the 
problem on Mac OS X Tiger, on various flavours of Debian, or on Windows.


Matthias.
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: sock_spin.erl
URL: <http://erlang.org/pipermail/erlang-bugs/attachments/20080416/a81a49a6/attachment.ksh>