[erlang-bugs] SSL connection upgrade hangs when timeout is specified.
Daniel Barney
dan353hehe@REDACTED
Tue Jan 15 19:41:50 CET 2013
Hey guys,
I'm running R15B03.
I have noticed that the SSL upgrade process hangs even when I have
specified a timeout. After about 2 days, I have 2000+ processes that
are just sitting there waiting; they never seem to finish the
upgrade or time out.
Here is how I am upgrading a TCP socket to an SSL socket:
%% looks pretty standard to me
%% Socket is a gen_tcp socket
Cert = "cert.crt",
Key = "key.key",
CaCert = "cacert.ca.crt",
Certs = [
{certfile, Cert}
,{keyfile, Key}
,{cacertfile,CaCert}],
%% this call sometimes fails to time out, but not every time
ssl:ssl_accept(Socket, [{active,false},{verify,verify_none}] ++ Certs, 10000)
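As a stopgap while the hang is unexplained, the call could be bounded from the outside. The sketch below is only an assumption about one way to do that: the module name accept_guard and the function with_deadline/2 are hypothetical, not part of the ssl application. It relies on timer:kill_after/1 to kill the calling process if the fun has not returned in time, so it only fits a disposable per-connection process, and it assumes (not verified here) that the ssl_connection process cleans up when its caller dies.

```erlang
-module(accept_guard).
-export([with_deadline/2]).

%% Hypothetical helper: run Fun() in the calling process. If Fun() has
%% not returned within TimeoutMs milliseconds, the calling process is
%% killed by the timer server. The timer is cancelled as soon as Fun()
%% returns or raises.
%% Note: there is a small race window between Fun() returning and the
%% timer being cancelled, so TimeoutMs should be chosen generously.
with_deadline(Fun, TimeoutMs) when is_function(Fun, 0),
                                   is_integer(TimeoutMs),
                                   TimeoutMs > 0 ->
    {ok, TRef} = timer:kill_after(TimeoutMs),
    try
        Fun()
    after
        timer:cancel(TRef)
    end.
```

It might be used as accept_guard:with_deadline(fun() -> ssl:ssl_accept(Socket, Opts, 10000) end, 15000), where Opts stands for the option list shown above and the outer deadline is deliberately longer than the timeout passed to ssl.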
Here is what erlang:process_display/2 shows for both processes. This
is just to show where the pair of processes is stuck. I have also
removed all sensitive data:
Program counter: 0x00007fc236450618 (gen:do_call/4 + 576)
CP: 0x0000000000000000 (invalid)
arity = 0
0x00007fc21f1c8360 Return addr 0x00007fc2338fa480
(gen_fsm:sync_send_all_state_event/3 + 128)
y(0) #Ref<0.0.62.36895>
y(1) 'baker@REDACTED'
y(2) []
y(3) infinity
y(4) {start,10000}
y(5) '$gen_sync_all_state_event'
y(6) <0.4415.47>
0x00007fc21f1c83a0 Return addr 0x00007fc232730d80
(ssl_connection:sync_send_all_state_event/2 + 80)
y(0) infinity
y(1) {start,10000}
y(2) <0.4415.47>
y(3) Catch 0x00007fc2338fa480 (gen_fsm:sync_send_all_state_event/3 + 128)
0x00007fc21f1c83c8 Return addr 0x00007fc2327242b0
(ssl_connection:handshake/2 + 200)
y(0) Catch 0x00007fc232730da0
(ssl_connection:sync_send_all_state_event/2 + 112)
0x00007fc21f1c83d8 Return addr 0x00007fc23272ee60
(ssl_connection:start_fsm/8 + 776)
0x00007fc21f1c83e0 Return addr 0x00007fc2327240f0
(ssl_connection:ssl_accept/6 + 152)
y(0) []
y(1) []
y(2) {sslsocket,new_ssl,<0.4415.47>}
y(3) Catch 0x00007fc23272eeb8 (ssl_connection:start_fsm/8 + 864)
0x00007fc21f1c8408 Return addr 0x00007fc2320e51a0
(baker_https:encryptConnection/3 + 472)
y(0) Catch 0x00007fc232724110 (ssl_connection:ssl_accept/6 + 184)
0x00007fc21f1c8418 Return addr 0x000000000087f8a8 (<terminate process normally>)
y(0) Catch 0x00007fc2320e54c0 (baker_https:encryptConnection/3 + 1272)
y(1) baker_http
y(2) []
y(3) [{cert_ip,{_,_,_,_}}] % removed by daniel
y(4) []
Program counter: 0x00007fc2338fbd38 (gen_fsm:loop/7 + 280)
CP: 0x0000000000000000 (invalid)
arity = 0
0x00007fc21d842fa0 Return addr 0x00007fc233aea768
(proc_lib:init_p_do_apply/3 + 56)
y(0) []
y(1) infinity
y(2) ssl_connection
y(3) {state,server,{#Ref<0.0.62.36893>,<0.4414.47>},gen_tcp,tcp,tcp_closed,tcp_error,"localhost",443,#Port<0.3254116>,{ssl_options,[],verify_none,{#Fun<ssl.1.127417028>,[]},false,false,undefined,1,<<43
bytes>>,undefined,<<43 bytes>>,undefined,undefined,undefined,<<46
bytes>>,undefined,undefined,[<<2 bytes>>,<<2 bytes>>,<<2 bytes>>,<<2
bytes>>,<<2 bytes>>,<<2 bytes>>,<<2 bytes>>,<<2 bytes>>,<<2
bytes>>,<<2 bytes>>,<<2 bytes>>,<<2 bytes>>,<<2
bytes>>],#Fun<ssl.0.127417028>,true,268435456,false,[],undefined,false},{socket_options,binary,0,0,0,false},{connection_states,{connection_state,{security_parameters,<<2
bytes>>,0,0,0,0,0,0,0,0,0,0,0,undefined,undefined,undefined,undefined},undefined,undefined,undefined,1,true,undefined,undefined},{connection_state,{security_parameters,<<2
bytes>>,0,7,1,16,256,32,unknown,2,4711,20,0,undefined,<<32
bytes>>,<<32 bytes>>,undefined},undefined,undefined,undefined,undefined,true,undefined,undefined},{connection_state,{security_parameters,<<2
bytes>>,0,0,0,0,0,0,0,0,0,0,0,undefined,undefined,undefined,undefined},undefined,undefined,undefined,4,true,undefined,undefined},{connection_state,{security_parameters,<<2
bytes>>,0,7,1,16,256,32,unknown,2,4711,20,0,undefined,<<32
bytes>>,<<32 bytes>>,undefined},undefined,undefined,undefined,undefined,true,undefined,undefined}},[],<<0
bytes>>,<<0 bytes>>,{[[14,<<3 bytes>>,<<0 bytes>>],[12,<<3
bytes>>,<<522 bytes>>],[11,<<3 bytes>>,<<2243 bytes>>],[2,<<3
bytes>>,<<77 bytes>>],<<221 bytes>>],[[12,<<3 bytes>>,<<522
bytes>>],[11,<<3 bytes>>,<<2243 bytes>>],[2,<<3 bytes>>,<<77
bytes>>],<<221 bytes>>]},[],12305,{session,<<32
bytes>>,undefined,<<1249 bytes>>,0,<<2
bytes>>,undefined,new,63525415061},24596,ssl_session_cache,{3,1},undefined,false,dhe_rsa,{md5sha,rsa},undefined,{'RSAPrivateKey','two-prime',REMOVED
PRIME KEY STUFF ,asn1_NOVALUE},{'DHParameter',REMOVED DH
PARAMETERS,2,asn1_NOVALUE},{<<132 bytes>>,<<132
bytes>>},undefined,16402,#Ref<0.0.62.1224>,0,<<0
bytes>>,true,{false,first},{<0.4414.47>,#Ref<0.0.62.36895>},{[],[]},false,true}
y(4) certify
y(5) <0.4415.47>
y(6) <0.536.0>
0x00007fc21d842fe0 Return addr 0x000000000087f8a8 (<terminate process normally>)
y(0) Catch 0x00007fc233aea788 (proc_lib:init_p_do_apply/3 + 88)
And here is what erlang:process_info shows:
%% the gen_fsm:
[{current_function,{gen_fsm,loop,7}},
{initial_call,{proc_lib,init_p,5}},
{status,waiting},
{message_queue_len,0},
{messages,[]},
{links,[<0.536.0>,#Port<0.3254116>]},
{dictionary,[{ssl_manager,ssl_manager},
{'$ancestors',[ssl_connection_sup,ssl_sup,<0.533.0>]},
{'$initial_call',{ssl_connection,init,1}}]},
{trap_exit,false},
{error_handler,error_handler},
{priority,normal},
{group_leader,<0.532.0>},
{total_heap_size,13530},
{heap_size,6765},
{stack_size,10},
{reductions,16537},
{garbage_collection,[{min_bin_vheap_size,46368},
{min_heap_size,233},
{fullsweep_after,65535},
{minor_gcs,20}]},
{suspending,[]}]
%% the process requesting the upgrade:
[{current_function,{gen,do_call,4}},
{initial_call,{baker_https,clientName,3}},
{status,waiting},
{message_queue_len,0},
{messages,[]},
{links,[<0.867.0>]},
{dictionary,[]},
{trap_exit,false},
{error_handler,error_handler},
{priority,normal},
{group_leader,<0.548.0>},
{total_heap_size,1974},
{heap_size,987},
{stack_size,29},
{reductions,964},
{garbage_collection,[{min_bin_vheap_size,46368},
{min_heap_size,233},
{fullsweep_after,65535},
{minor_gcs,2}]},
{suspending,[]}]
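To count how many processes are parked at the same place, the process list can be filtered on current_function. This is a minimal sketch; the module name stuck_procs and the function in_function/1 are hypothetical helpers, not part of OTP.

```erlang
-module(stuck_procs).
-export([in_function/1]).

%% Hypothetical helper: return the pids whose current function is
%% exactly MFA, e.g. {gen, do_call, 4}. Processes that exit while the
%% list is being built make process_info/2 return 'undefined', which
%% simply fails the comparison and is skipped.
in_function({M, F, A} = MFA) when is_atom(M), is_atom(F), is_integer(A) ->
    [Pid || Pid <- erlang:processes(),
            erlang:process_info(Pid, current_function)
                =:= {current_function, MFA}].
```

For example, length(stuck_procs:in_function({gen, do_call, 4})) gives the number of processes blocked in a gen call. That includes every such caller on the node, not only the ones waiting on an SSL handshake, so the result is an upper bound.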
As far as I can tell, the process that requests the connection be
upgraded drops down into a gen_fsm:sync_send_all_state_event call that
has a timeout of infinity. The only problem seems to be that the
ssl_connection fsm never receives or never replies to the request, as
its message queue length is 0. At least that is what I am seeing. I do
not know what causes the behaviour, as it only happens on a live
server, and I haven't been able to replicate it in any of the tests
that I have tried to write.
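One possible way to see whether the {start,10000} request ever reaches the fsm is to trace its messages with dbg from runtime_tools. This is a sketch only; the module name trace_fsm and the function messages/1 are hypothetical, and tracing on a live server adds load, so it is probably best applied to a single freshly started ssl_connection process.

```erlang
-module(trace_fsm).
-export([messages/1]).

%% Hypothetical helper: start the default dbg tracer if it is not
%% already running, then trace every message sent and received by Pid.
%% Trace output is printed by the tracer process.
messages(Pid) when is_pid(Pid) ->
    case dbg:tracer() of
        {ok, _Tracer} -> ok;
        {error, already_started} -> ok
    end,
    %% 'm' is the dbg shorthand for tracing both send and 'receive'.
    dbg:p(Pid, [m]).
```

Calling trace_fsm:messages(Pid) with the pid of the connection process would show whether the '$gen_sync_all_state_event' message arrives and whether a reply is ever sent; dbg:stop() ends the trace session.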
Does anyone know where I could start looking to debug this issue further?
Daniel