[erlang-questions] processes stuck in erlang:bif_return_trap/1
H.Li@REDACTED
Mon Sep 8 22:30:16 CEST 2014
Dear All,
I wonder if anyone could help me with this.
We have been having problems with an Erlang node running Mnesia. Every
few minutes the node becomes unresponsive, then recovers on its own after
a short period of time.
We did some profiling using etop, which shows that when the system
freezes, about 5000 processes have accumulated on this node, all stuck in
the erlang:bif_return_trap/1 function. These processes are spawned by an
rpc_server running on this node; each process computes M:F(A) and then
either sends the result back via gen_tcp or stores it in an ets table.
I don't quite understand what erlang:bif_return_trap/1 does, and I am
confused about why so many processes get stuck in this function. The
Erlang node is running on a machine with 12 physical cores (hence 24
schedulers), and the version of Erlang is R15B03. Here is part of the etop
output:
Load:  cpu       317      Memory:  total     66521736   binary     393593
       procs    5785               processes  1253072   code        12280
       runq        0               atom           493   ets      64843631

Pid              Name or Initial Func   Time     Reds    Memory  MsgQ  Current Function
----------------------------------------------------------------------------------------
<5291.115.0>     mnesia_tm               '-'        0  ********     0  mnesia_tm:doit_loop/
<5291.6.0>       error_logger            '-'        0  12342496     0  gen_event:fetch_msg/
<5291.61.0>      memsup                  '-'      138   4714912     0  gen_server:loop/6
<5291.1464.0>    proc_lib:init_p/5       '-'      103   2914384     0  gen_fsm:loop/7
<5291.1466.0>    proc_lib:init_p/5       '-'      105   2914384     0  gen_fsm:loop/7
...
***************  rpc_socket:worker/6     '-'     8141    101144     0  erlang:bif_return_tr
***************  rpc_socket:worker/6     '-'     7900    101144     0  erlang:bif_return_tr
***************  rpc_socket:worker/6     '-'     7901    101144     0  erlang:bif_return_tr
***************  rpc_socket:worker/6     '-'     7954    101144     0  erlang:bif_return_tr
***************  rpc_socket:worker/6     '-'     7975    101144     0  erlang:bif_return_tr
***************  rpc_socket:worker/6     '-'     8068    101144     0  erlang:bif_return_tr
***************  rpc_socket:worker/6     '-'     7956    101144     0  erlang:bif_return_tr
***************  rpc_socket:worker/6     '-'     8212    101144     0  erlang:bif_return_tr
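
The same picture can in principle be cross-checked from a shell on the
node without etop, by grouping all processes by their current function.
The following is only a minimal sketch: it relies on the standard
erlang:processes/0 and erlang:process_info/2 BIFs and the dict module,
while the module name stuck_procs and its function are made up for
illustration and are not part of our system.

```erlang
%% Hypothetical helper module; the name stuck_procs is illustrative only.
-module(stuck_procs).
-export([count_by_current_function/0]).

%% Returns a list of {CurrentFunction, Count} tuples, largest count first.
%% CurrentFunction is whatever process_info/2 reports, normally {M, F, A}.
count_by_current_function() ->
    Counts = lists:foldl(fun count_one/2, dict:new(), erlang:processes()),
    lists:reverse(lists:keysort(2, dict:to_list(Counts))).

%% Adds one process to the running tally.
count_one(Pid, Acc) ->
    case erlang:process_info(Pid, current_function) of
        {current_function, MFA} ->
            dict:update_counter(MFA, 1, Acc);
        undefined ->
            %% The process exited between processes/0 and process_info/2.
            Acc
    end.
```

Calling stuck_procs:count_by_current_function() during a freeze should, if
the etop output is representative, list erlang:bif_return_trap/1 near the
top with a count in the thousands.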
Many Thanks!
Huiqing