[erlang-bugs] SSL connection upgrade hangs when timeout is specified.

Daniel Barney dan353hehe@REDACTED
Tue Jan 15 19:41:50 CET 2013


Hey guys,

I'm running R15B03.

I have noticed that the SSL upgrade process hangs even when I have
specified a timeout. After about 2 days, I have 2000+ processes that
just sit around waiting; they never seem to finish the upgrade
process or time out.

Here is how I am upgrading a TCP socket to an SSL socket:

%% Looks pretty standard to me.
%% Socket is a gen_tcp socket.

Cert = "cert.crt",
Key = "key.key",
CaCert = "cacert.ca.crt",
Certs = [{certfile, Cert},
         {keyfile, Key},
         {cacertfile, CaCert}],

%% This call sometimes (not always) never returns, despite the
%% 10000 ms timeout.
ssl:ssl_accept(Socket, [{active, false}, {verify, verify_none}] ++ Certs,
               10000)
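
Since the Timeout argument is evidently not bounding the call in every case, one possible stopgap is to bound a blocking call from the outside. The following is only a generic sketch of that pattern, not a fix for the underlying problem: the module name `deadline` and the function `call_with_deadline/2` are made up for illustration and are not part of OTP or of the code above.

```erlang
%% deadline.erl (hypothetical helper module, not part of OTP)
-module(deadline).
-export([call_with_deadline/2]).

%% Run Fun in a monitored worker process and wait at most Timeout ms.
%% Returns {ok, Result}, {error, timeout}, or {error, {worker_died, Reason}}.
call_with_deadline(Fun, Timeout) ->
    Parent = self(),
    Tag = make_ref(),
    {Pid, MRef} = spawn_monitor(fun() -> Parent ! {Tag, Fun()} end),
    receive
        {Tag, Result} ->
            %% Worker finished in time; drop the pending 'DOWN' message.
            erlang:demonitor(MRef, [flush]),
            {ok, Result};
        {'DOWN', MRef, process, Pid, Reason} ->
            %% Worker crashed before it could send a result.
            {error, {worker_died, Reason}}
    after Timeout ->
        %% Deadline passed: kill the worker so it cannot pile up.
        exit(Pid, kill),
        erlang:demonitor(MRef, [flush]),
        %% Discard a result that raced with the timeout, if any.
        receive {Tag, _Late} -> ok after 0 -> ok end,
        {error, timeout}
    end.
```

Using this for the handshake itself would need extra wiring that is left out here: the worker would first have to become the TCP socket's controlling process (gen_tcp:controlling_process/2), and on success the SSL socket would have to be handed back to the caller (ssl:controlling_process/2).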

Here is what erlang:process_display/2 shows for both processes. This
is just to show where the pair of processes is stuck. I have also
removed all sensitive data:

Program counter: 0x00007fc236450618 (gen:do_call/4 + 576)
CP: 0x0000000000000000 (invalid)
arity = 0

0x00007fc21f1c8360 Return addr 0x00007fc2338fa480
(gen_fsm:sync_send_all_state_event/3 + 128)
y(0)     #Ref<0.0.62.36895>
y(1)     'baker@REDACTED'
y(2)     []
y(3)     infinity
y(4)     {start,10000}
y(5)     '$gen_sync_all_state_event'
y(6)     <0.4415.47>

0x00007fc21f1c83a0 Return addr 0x00007fc232730d80
(ssl_connection:sync_send_all_state_event/2 + 80)
y(0)     infinity
y(1)     {start,10000}
y(2)     <0.4415.47>
y(3)     Catch 0x00007fc2338fa480 (gen_fsm:sync_send_all_state_event/3 + 128)

0x00007fc21f1c83c8 Return addr 0x00007fc2327242b0
(ssl_connection:handshake/2 + 200)
y(0)     Catch 0x00007fc232730da0
(ssl_connection:sync_send_all_state_event/2 + 112)

0x00007fc21f1c83d8 Return addr 0x00007fc23272ee60
(ssl_connection:start_fsm/8 + 776)

0x00007fc21f1c83e0 Return addr 0x00007fc2327240f0
(ssl_connection:ssl_accept/6 + 152)
y(0)     []
y(1)     []
y(2)     {sslsocket,new_ssl,<0.4415.47>}
y(3)     Catch 0x00007fc23272eeb8 (ssl_connection:start_fsm/8 + 864)

0x00007fc21f1c8408 Return addr 0x00007fc2320e51a0
(baker_https:encryptConnection/3 + 472)
y(0)     Catch 0x00007fc232724110 (ssl_connection:ssl_accept/6 + 184)

0x00007fc21f1c8418 Return addr 0x000000000087f8a8 (<terminate process normally>)
y(0)     Catch 0x00007fc2320e54c0 (baker_https:encryptConnection/3 + 1272)
y(1)     baker_http
y(2)     []
y(3)     [{cert_ip,{_,_,_,_}}] % removed by daniel
y(4)     []





Program counter: 0x00007fc2338fbd38 (gen_fsm:loop/7 + 280)
CP: 0x0000000000000000 (invalid)
arity = 0

0x00007fc21d842fa0 Return addr 0x00007fc233aea768
(proc_lib:init_p_do_apply/3 + 56)
y(0)     []
y(1)     infinity
y(2)     ssl_connection
y(3)     {state,server,{#Ref<0.0.62.36893>,<0.4414.47>},gen_tcp,tcp,tcp_closed,tcp_error,"localhost",443,#Port<0.3254116>,{ssl_options,[],verify_none,{#Fun<ssl.1.127417028>,[]},false,false,undefined,1,<<43
bytes>>,undefined,<<43 bytes>>,undefined,undefined,undefined,<<46
bytes>>,undefined,undefined,[<<2 bytes>>,<<2 bytes>>,<<2 bytes>>,<<2
bytes>>,<<2 bytes>>,<<2 bytes>>,<<2 bytes>>,<<2 bytes>>,<<2
bytes>>,<<2 bytes>>,<<2 bytes>>,<<2 bytes>>,<<2
bytes>>],#Fun<ssl.0.127417028>,true,268435456,false,[],undefined,false},{socket_options,binary,0,0,0,false},{connection_states,{connection_state,{security_parameters,<<2
bytes>>,0,0,0,0,0,0,0,0,0,0,0,undefined,undefined,undefined,undefined},undefined,undefined,undefined,1,true,undefined,undefined},{connection_state,{security_parameters,<<2
bytes>>,0,7,1,16,256,32,unknown,2,4711,20,0,undefined,<<32
bytes>>,<<32 bytes>>,undefined},undefined,undefined,undefined,undefined,true,undefined,undefined},{connection_state,{security_parameters,<<2
bytes>>,0,0,0,0,0,0,0,0,0,0,0,undefined,undefined,undefined,undefined},undefined,undefined,undefined,4,true,undefined,undefined},{connection_state,{security_parameters,<<2
bytes>>,0,7,1,16,256,32,unknown,2,4711,20,0,undefined,<<32
bytes>>,<<32 bytes>>,undefined},undefined,undefined,undefined,undefined,true,undefined,undefined}},[],<<0
bytes>>,<<0 bytes>>,{[[14,<<3 bytes>>,<<0 bytes>>],[12,<<3
bytes>>,<<522 bytes>>],[11,<<3 bytes>>,<<2243 bytes>>],[2,<<3
bytes>>,<<77 bytes>>],<<221 bytes>>],[[12,<<3 bytes>>,<<522
bytes>>],[11,<<3 bytes>>,<<2243 bytes>>],[2,<<3 bytes>>,<<77
bytes>>],<<221 bytes>>]},[],12305,{session,<<32
bytes>>,undefined,<<1249 bytes>>,0,<<2
bytes>>,undefined,new,63525415061},24596,ssl_session_cache,{3,1},undefined,false,dhe_rsa,{md5sha,rsa},undefined,{'RSAPrivateKey','two-prime',REMOVED
PRIME KEY STUFF ,asn1_NOVALUE},{'DHParameter',REMOVED DH
PARAMETERS,2,asn1_NOVALUE},{<<132 bytes>>,<<132
bytes>>},undefined,16402,#Ref<0.0.62.1224>,0,<<0
bytes>>,true,{false,first},{<0.4414.47>,#Ref<0.0.62.36895>},{[],[]},false,true}
y(4)     certify
y(5)     <0.4415.47>
y(6)     <0.536.0>

0x00007fc21d842fe0 Return addr 0x000000000087f8a8 (<terminate process normally>)
y(0)     Catch 0x00007fc233aea788 (proc_lib:init_p_do_apply/3 + 88)


And here is what erlang:process_info shows:

%% the gen_fsm:
[{current_function,{gen_fsm,loop,7}},
 {initial_call,{proc_lib,init_p,5}},
 {status,waiting},
 {message_queue_len,0},
 {messages,[]},
 {links,[<0.536.0>,#Port<0.3254116>]},
 {dictionary,[{ssl_manager,ssl_manager},
              {'$ancestors',[ssl_connection_sup,ssl_sup,<0.533.0>]},
              {'$initial_call',{ssl_connection,init,1}}]},
 {trap_exit,false},
 {error_handler,error_handler},
 {priority,normal},
 {group_leader,<0.532.0>},
 {total_heap_size,13530},
 {heap_size,6765},
 {stack_size,10},
 {reductions,16537},
 {garbage_collection,[{min_bin_vheap_size,46368},
                      {min_heap_size,233},
                      {fullsweep_after,65535},
                      {minor_gcs,20}]},
 {suspending,[]}]


%% the process requesting the upgrade:
[{current_function,{gen,do_call,4}},
 {initial_call,{baker_https,clientName,3}},
 {status,waiting},
 {message_queue_len,0},
 {messages,[]},
 {links,[<0.867.0>]},
 {dictionary,[]},
 {trap_exit,false},
 {error_handler,error_handler},
 {priority,normal},
 {group_leader,<0.548.0>},
 {total_heap_size,1974},
 {heap_size,987},
 {stack_size,29},
 {reductions,964},
 {garbage_collection,[{min_bin_vheap_size,46368},
                      {min_heap_size,233},
                      {fullsweep_after,65535},
                      {minor_gcs,2}]},
 {suspending,[]}]
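
With 2000+ processes in this state, collecting the dumps above by hand does not scale. A small helper along the following lines could list every process currently sitting in a given function. This is a sketch only; the module name `stuck` and the function `in_function/1` are hypothetical, and it relies solely on erlang:processes/0 and erlang:process_info/2.

```erlang
%% stuck.erl (hypothetical helper module, not part of OTP)
-module(stuck).
-export([in_function/1]).

%% Return the pids of all local processes whose current function is
%% the given {Module, Function, Arity}. Processes that exit during the
%% scan yield 'undefined' from process_info/2 and are simply skipped.
in_function({_M, _F, _A} = MFA) ->
    [Pid || Pid <- erlang:processes(),
            erlang:process_info(Pid, current_function)
                =:= {current_function, MFA}].
```

Going by the dumps above, `stuck:in_function({gen, do_call, 4})` would be the call to try for the requesting side. Every blocked gen_server or gen_fsm call waits in that same function, so the result would likely need further filtering, for example on the initial_call item of process_info.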

As far as I can tell, the process that requests the connection
upgrade drops down into a gen_fsm:sync_send_all_state_event call that
has a timeout of infinity. The only problem seems to be that the
ssl_connection fsm never receives or never replies to the request, as
its message queue length is 0. At least that is what I am seeing. I do
not know what causes the behaviour, as it only happens on a live
server and I haven't been able to reproduce it in any of the tests I
have tried to write.

Does anyone know where I could start looking to debug this issue further?
Daniel


