[erlang-bugs] R11B-5 to R12B-3 forks beams eating 100% CPU
Lev Walkin
vlm@REDACTED
Tue Jun 17 12:57:06 CEST 2008
Update:
after all, it seems that yaws:call_cgi() is the culprit. It invokes
the beam's fork(), which interacts badly with the underlying threading
library:
0x0000000800b8b8ca in pthread_sigmask () from /lib/libpthread.so.2
(gdb) bt
#0 0x0000000800b8b8ca in pthread_sigmask () from /lib/libpthread.so.2
#1 0x0000000800b8b876 in sigprocmask () from /lib/libpthread.so.2
#2 0x0000000800b95809 in pthread_mutexattr_init () from /lib/libpthread.so.2
#3 0x0000000800b8883a in fork () from /lib/libpthread.so.2
#4 0x000000000052eea6 in fini_getenv_state ()
#5 0x0000000000482a1a in erts_open_driver ()
#6 0x00000000004f0e7d in port_get_data_1 ()
#7 0x00000000004eef88 in open_port_2 ()
#8 0x0000000000508e9c in process_main ()
#9 0x0000000000445701 in erl_start ()
#10 0x000000000042b39b in main ()
(gdb) c
Continuing.
no thread to satisfy query
0x0000000800b8b8ca in pthread_sigmask () from /lib/libpthread.so.2
It turned out that the FreeBSD port adds --enable-threads to the
erts configure options. I tried two things:
1. Disabled threading at run time by specifying +A 0 (my default
was +A 32). This setting has no effect on the rate at which runaway
beams are created. That is, new beams still sporadically pop up,
eating 100% CPU.
2. Disabled threading at build time by removing --enable-threads from
the Erlang port's configuration. This appears to fix the issue.
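For context, here is a minimal Python sketch of the general hazard
behind mixing fork() with threads. It is an illustration only, not the
erts or libpthread code path shown in the backtrace. It assumes a POSIX
system with os.fork(); the name child_lock_stays_locked is made up for
this example. A helper thread holds a lock while the main thread forks:
the child gets a copy of the locked lock, but not the thread that would
release it.

```python
import os
import threading


def child_lock_stays_locked(timeout=0.5):
    """Fork while another thread holds a lock.

    Returns True if the forked child could NOT acquire the lock
    within `timeout` seconds, i.e. the lock stayed orphaned.
    """
    lock = threading.Lock()
    held = threading.Event()
    release = threading.Event()

    def holder():
        # Hold the lock until the parent tells us to let go.
        with lock:
            held.set()
            release.wait()

    thread = threading.Thread(target=holder)
    thread.start()
    held.wait()  # make sure the lock is held before forking

    pid = os.fork()
    if pid == 0:
        # Child: only the forking thread exists here. The holder
        # thread was not copied, so nobody can release the lock.
        got_it = lock.acquire(timeout=timeout)
        os._exit(0 if got_it else 1)

    # Parent: collect the child's verdict, then clean up the helper.
    _, status = os.waitpid(pid, 0)
    release.set()
    thread.join()
    return os.WIFEXITED(status) and os.WEXITSTATUS(status) == 1
```

A child that waited without a timeout, or that spun on such a lock,
would never make progress. Whether this is what actually happens inside
libpthread's fork() on FreeBSD 6.3 has not been confirmed.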
I firmly believe there should exist a more legitimate fix.
Is there anyone willing to perform a consultant's job of fixing this
hanging fork() issue under async-threading/FreeBSD/call_cgi, and
receive money in exchange for their services?
Lev Walkin wrote:
> correction:
>
> Disabling the yaws application DOES seem to have an effect on
> the rate of runaway beam creation.
>
>
> Lev Walkin wrote:
>> SYMPTOMS:
>>
>> A running, unstressed erlang system appears to spontaneously
>> fork off BEAM processes (beam) several times an hour,
>> each eating 100% CPU. (If there is more than one at a time,
>> they split the CPU accordingly).
>>
>> INVESTIGATION:
>>
>> The BEAM processes forked appear to be truly separate processes.
>> Killing them (-9) does not seem to harm the main VM, which
>> continues to execute the necessary set of applications
>> (kernel, stdlib, sasl, ssl, yaws, plus two inhouse ones).
>>
>> Disabling yaws application does not seem to have any effect on
>> the rate of runaway beam creation.
>>
>> ktrace shows a total absence of any system calls in the runaway
>> beam process, suggesting an infinite loop. Killing the process
>> results in a single ktrace event:
>>
>> [root@REDACTED ~]# kdump
>> 97445 beam PSIG SIGKILL SIG_DFL
>> [root@REDACTED ~]#
>>
>> The R12B-3 has the following fix:
>>
>>
>> OTP-7289 On Mac OS 10.5 (Leopard), sending to socket
>> which the other end closes could cause the
>> emulator to consume 100% CPU time.
>> (Thanks to Matthias Radestock.)
>>
>> Since Mac OS is sufficiently similar to FreeBSD, I had hoped
>> that OTP-7289 was directly related to my runaway beam problem.
>> Alas, R12B-3 behaves exactly the way R11B-5 and the later
>> versions of the Erlang VM behave; that is, runaway beams
>> continue to appear on a regular basis.
>>
>> WORKAROUND USED:
>>
>> Created an external process which periodically detects
>> and kills the runaway beams. Ugly.
>>
>> CONFIGURATION:
>>
>> Erlang (BEAM) emulator version 5.6.3 [source] [64-bit]
>> [async-threads:0] [hipe] [kernel-poll:false]
>>
>> FreeBSD host 6.3-RELEASE FreeBSD 6.3-RELEASE #0: Wed Jan 16 01:43:02
>> UTC 2008 root@REDACTED:/usr/obj/usr/src/sys/SMP amd64
>>
>> However, the system was running all versions of erlang VM
>> between R11B-5 and R12B-3 with the same results, so
>> it is not version specific.
>>
>>
>> Please suggest further steps to debug or eliminate the problem.
>>
>>
>
> _______________________________________________
> erlang-bugs mailing list
> erlang-bugs@REDACTED
> http://www.erlang.org/mailman/listinfo/erlang-bugs
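The watchdog workaround mentioned in the quoted report could be
sketched roughly as follows. This is not the actual script that was
used; it is a minimal Python sketch under stated assumptions: runaway
beams are direct children of the main VM, their command name is "beam"
(as in the kdump output), and 90% CPU is an arbitrary threshold. The
function names, the threshold, and the ps column selection are all
choices made for this example.

```python
import os
import signal
import subprocess


def find_runaway_beams(ps_output, main_pid, cpu_threshold=90.0):
    """Pick runaway beam PIDs out of `ps -o pid,ppid,pcpu,comm` output.

    Assumption: a runaway beam is a child of `main_pid`, is named
    "beam", and uses at least `cpu_threshold` percent CPU.
    """
    runaway = []
    for line in ps_output.splitlines():
        fields = line.split(None, 3)
        if len(fields) != 4:
            continue
        try:
            pid, ppid, cpu = int(fields[0]), int(fields[1]), float(fields[2])
        except ValueError:
            continue  # header line or unparsable row
        name = os.path.basename(fields[3].strip())
        if name == "beam" and ppid == main_pid and cpu >= cpu_threshold:
            runaway.append(pid)
    return runaway


def kill_runaway_beams(main_pid, cpu_threshold=90.0):
    """Run ps once and SIGKILL every process that looks like a runaway."""
    result = subprocess.run(
        ["ps", "-ax", "-o", "pid,ppid,pcpu,comm"],
        capture_output=True, text=True, check=True,
    )
    victims = find_runaway_beams(result.stdout, main_pid, cpu_threshold)
    for pid in victims:
        os.kill(pid, signal.SIGKILL)
    return victims
```

Something like kill_runaway_beams(main_vm_pid) would then be called
periodically, for example from cron or a sleep loop. Requiring the
parent PID to match the main VM is meant to avoid killing unrelated
Erlang nodes on the same host; if the runaway processes get reparented,
that check would need to be relaxed.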