[erlang-questions] Erlang hangs in supervisor:do_terminate/2

Nico Kruber nico.kruber@REDACTED
Sat Jul 11 14:33:48 CEST 2015


Hi,
I'm having trouble with supervisor:do_terminate/2 in both Erlang 18.0.1 and
18.0.2, which I have not seen with earlier versions. I do not have a minimal
test case yet but will try to come up with one soon.

I'm using Scalaris and, inside a single process, I start its services
(multiple processes in a supervisor tree) and stop them again in a loop.
Very rarely, stopping the services hangs. When I send the beam process a
SIGUSR1 to force a crash dump, I always see two processes in the "Running"
state:
1) a supervisor in supervisor:do_terminate/2 (any of the supervisors present,
not always the same one)
2) a child/worker of this supervisor that is handling a message (or at least
appears to be)
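
To illustrate the pattern, here is a rough Erlang sketch of such a start/stop
loop. It is not the Scalaris code and is not known to reproduce the hang. The
module name sup_loop, the trivial worker body and the stop/1 helper are
hypothetical; only the one_for_one strategy, the {wpool_w, N} child ids and
the permanent/brutal_kill worker settings are taken from the dump at the end
of this message.

```erlang
-module(sup_loop).
-behaviour(supervisor).

-export([run/1, start_worker/0, worker_loop/0]).
-export([init/1]).

%% Start the supervisor tree and stop it again, N times in a row.
%% Must be called from the process that should own the supervisors.
run(0) ->
    ok;
run(N) when N > 0 ->
    {ok, Sup} = supervisor:start_link(?MODULE, []),
    %% terminate_child/2 is the call that ends up in
    %% supervisor:do_terminate/2 for a brutal_kill child.
    ok = supervisor:terminate_child(Sup, {wpool_w, 1}),
    ok = supervisor:terminate_child(Sup, {wpool_w, 2}),
    ok = stop(Sup),
    run(N - 1).

%% Hypothetical helper: shut the supervisor down from its parent
%% and wait until it is really gone.
stop(Sup) ->
    Ref = erlang:monitor(process, Sup),
    unlink(Sup),
    exit(Sup, shutdown),
    receive
        {'DOWN', Ref, process, Sup, _Reason} -> ok
    end.

%% supervisor callback: two permanent workers killed with brutal_kill.
init([]) ->
    SupFlags = {one_for_one, 10, 1},
    Children = [{{wpool_w, I},
                 {?MODULE, start_worker, []},
                 permanent, brutal_kill, worker, [?MODULE]}
                || I <- [1, 2]],
    {ok, {SupFlags, Children}}.

%% Hypothetical worker: a linked process that just consumes messages.
start_worker() ->
    {ok, spawn_link(?MODULE, worker_loop, [])}.

worker_loop() ->
    receive
        _Msg -> worker_loop()
    after 60000 ->
        %% arbitrary idle timeout, then keep waiting
        worker_loop()
    end.
```

Calling sup_loop:run(1000) from a shell runs the loop 1000 times. With the
real services the hang shows up only very rarely, so a sketch this small may
well never trigger it.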

The stack traces of these two processes seem inconclusive; an example of each,
taken from the crashdump_viewer, is included at the end of this message.


Is there any known problem/change in Erlang 18 that could have caused this?


Regards
Nico


1) supervisor:
'0x00002b6c8aec69a0'
"Return addr 0x8942AFD8 (supervisor:do_terminate/2 + 240)"
y0
<0.6955.0>

'0x00002b6c8aec69b0'
"Return addr 0x89423D28 (supervisor:handle_call/3 + 2408)"
y0
{<0.6951.0>,sup_wpool}
y1
permanent
y2
{child,<0.6955.0>,
       {wpool_w,2},
       {wpool_worker,start_link,[...]},
       permanent,brutal_kill,worker,...}

'0x00002b6c8aec69d0'
"Return addr 0x89405070 (gen_server:try_handle_call/4 + 176)"
y0
{state,{<0.6951.0>,sup_wpool},
       one_for_one,
       [{child,[...],...},{child,...}],
       undefined,10,1,...}
y1
one_for_one
y2
{child,<0.6955.0>,
       {wpool_w,2},
       {wpool_worker,start_link,[...]},
       permanent,brutal_kill,worker,...}

'0x00002b6c8aec69f0'
"Return addr 0x894056B8 (gen_server:handle_msg/5 + 192)"
y0
"Catch 0x894050D0 (gen_server:try_handle_call/4 + 272)"

'0x00002b6c8aec6a00'
"Return addr 0x7F6AF758 (proc_lib:init_p_do_apply/3 + 56)"
y0
supervisor
y1
{state,{<0.6951.0>,sup_wpool},
       one_for_one,
       [{child,[...],...},{child,...}],
       undefined,10,1,...}
y2
<0.6951.0>
y3
<0.6941.0>
y4
{terminate_child,{wpool_w,2}}
y5
{<0.7140.0>,#Ref<10689.0.3.14414>}

'0x00002b6c8aec6a38'
"Return addr 0x9014E8 (<terminate process normally>)"
y0
"Catch 0x7F6AF778 (proc_lib:init_p_do_apply/3 + 88)"


2) child worker:
'0x00002b6c8aec39d0'
60000

'0x00002b6c8aec39d8'
"Return addr 0x9014E8 (<terminate process normally>)"
y0
{wpool_worker,#Fun<wpool_worker.on.2>,[],false,[],false,unknown,...}
y1
[...(Incomplete Heap)]
