[erlang-questions] Processes hanging in prim_inet:close_port/1
Björn-Egil Dahlberg
wallentin.dahlberg@REDACTED
Sat Dec 7 05:30:11 CET 2013
We are aware of this problem, and I think a fix is slated for R16B03, though
I don't see it on maint on GitHub yet.
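In some cases the runtime forces the port to unlink during port_close, which
causes this problem: with the link gone, the {'EXIT', Port, _} message that
prim_inet:close_port/1 waits for is never delivered.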
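Until a fixed release is available, something like the following might help locate the stuck processes and nudge them along. It is only a sketch under assumptions: the module and function names (close_port_stuck, stuck/0, unblock/1) are made up for illustration, it has to run on the node that owns the processes, and it assumes the port named in the queued tcp_closed message is the one close_port/1 is waiting on. Forging the {'EXIT', Port, normal} message is a workaround, not an official fix.

```erlang
-module(close_port_stuck).
-export([stuck/0, unblock/1]).

%% Hypothetical helper: list local processes that are waiting inside
%% prim_inet:close_port/1, as in the process info quoted below.
stuck() ->
    [Pid || Pid <- erlang:processes(), is_stuck(Pid)].

is_stuck(Pid) ->
    %% process_info/2 returns the items in the order requested,
    %% or 'undefined' if the process has already exited.
    case erlang:process_info(Pid, [current_function, status]) of
        [{current_function, {prim_inet, close_port, 1}},
         {status, waiting}] ->
            true;
        _ ->
            false
    end.

%% Hypothetical workaround: for every {tcp_closed, Port} message queued
%% in Pid's mailbox, send a forged {'EXIT', Port, normal} message.
%% ASSUMPTION: Port is the socket that close_port/1 is blocked on, so
%% the forged tuple matches its `receive {'EXIT',S,_} -> ok end`.
unblock(Pid) ->
    case erlang:process_info(Pid, messages) of
        {messages, Msgs} ->
            lists:foreach(
              fun({tcp_closed, Port}) when is_port(Port) ->
                      Pid ! {'EXIT', Port, normal};
                 (_Other) ->
                      ok
              end, Msgs),
            ok;
        undefined ->
            %% The process is gone; nothing to do.
            ok
    end.
```

From a shell on the affected node, `[close_port_stuck:unblock(P) || P <- close_port_stuck:stuck()].` would apply the workaround to every match. Inspecting the output of stuck/0 first is probably wise, since a healthy process can pass through close_port/1 briefly.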
// Björn-Egil
2013/12/6 Brandon Clark <a.brandon.clark@REDACTED>
> Greetings!
>
> I have an application where, once in a while, processes appear to hang in
> prim_inet:close_port/1. I'm not sure what to do with that.
>
> The application uses R16B02.
>
> Here's a sample of the process info for a "stuck" process:
>
> [{current_stacktrace,[{prim_inet,close_port,1,[]},
>      {inet,tcp_close,1,[{file,"inet.erl"},{line,1422}]},
>      {proxy_worker,disconnect_endpoint,1,
>          [{file,"src/proxy_worker.erl"},{line,1082}]},
>      {proxy_worker,process_chunked_reply,1,
>          [{file,"src/proxy_worker.erl"},{line,496}]},
>      {proxy_worker,process_request,2,
>          [{file,"src/proxy_worker.erl"},{line,100}]},
>      {proxy_worker,work_request,2,
>          [{file,"src/proxy_worker.erl"},{line,87}]},
>      {pool_worker,handle_cast,2,
>          [{file,"src/pool_worker.erl"},{line,73}]},
>      {gen_server,handle_msg,5,
>          [{file,"gen_server.erl"},{line,604}]}]},
> {current_function,{prim_inet,close_port,1}},
> {initial_call,{proc_lib,init_p,5}},
> {status,waiting},
> {message_queue_len,1},
> {messages,[{tcp_closed,#Port<9492.545567>}]},
> {links,[<9492.10266.0>]},
> {dictionary,[{'$ancestors',[<9492.10266.0>,<9492.10263.0>,
> cellophane_sup,<9492.114.0>]},
> {'$initial_call',{pool_worker,init,1}}]},
> {trap_exit,true},
> {error_handler,error_handler},
> {priority,normal},
> {group_leader,<9492.113.0>},
> {total_heap_size,2586},
> {heap_size,2586},
> {stack_size,44},
> {reductions,4490210},
> {garbage_collection,[{min_bin_vheap_size,46422},
> {min_heap_size,233},
> {fullsweep_after,0},
> {minor_gcs,0}]},
> {suspending,[]}]}]
>
>
> Apparently, this condition occurs -- intermittently -- when the client
> (the process described above) and the server (a different app and platform)
> decide to close their socket connection at about the same time. Notice the
> tcp_closed message in the inbox.
>
> The socket in question is in {active, once} mode.
>
> The {status,waiting} and {trap_exit, true} suggest to me that the process
> is stuck in the receive at line 210 of prim_inet:
>
> 188 close_port(S) ->
> 189     case erlang:process_info(self(), trap_exit) of
> 190         {trap_exit,true} ->
> 191             %% Ensure exit message and consume it
> 192             link(S),
> 193             %% This is still not a perfect solution.
> 194             %%
> 195             %% The problem is to close the port and consume any exit
> 196             %% message while not knowing if this process traps exit
> 197             %% nor if this process has a link to the port. Here we
> 198             %% just knows that this process traps exit.
> 199             %%
> 200             %% If we right here get killed for some reason that exit
> 201             %% signal will propagate to the port and onwards to anyone
> 202             %% that is linked to the port. E.g when we close a socket
> 203             %% that is not ours.
> 204             %%
> 205             %% The problem can be solved with lists:member on our link
> 206             %% list but we deem that as potentially too expensive. We
> 207             %% need an is_linked/1 function or guard, or we need
> 208             %% a port_close function that can atomically unlink...
> 209             catch erlang:port_close(S),
> 210             receive {'EXIT',S,_} -> ok end;
> 211         {trap_exit,false} ->
> 212             catch erlang:port_close(S),
> 213             ok
> 214     end.
>
>
> But now I'm speculating. I'm in over my head, here.
>
> Any suggestions on how to clear up these stuck processes would be greatly
> appreciated!
>
> ~BC
>
>
> _______________________________________________
> erlang-questions mailing list
> erlang-questions@REDACTED
> http://erlang.org/mailman/listinfo/erlang-questions
>
>