[erlang-bugs] [erlang-questions] Process/FD leak in SSL R15B01

Ingela Anderton Andin <>
Wed Oct 24 14:59:25 CEST 2012


Hi!

Loïc Hoguin wrote:
> Hey,
> 
> I would like to try but this patch isn't correct for R15B01. :)
> 
> handle_trusted_certs_db doesn't seem to exist.

Sorry, it was based on master. It should work for R15B01 if you
remove the handle_trusted_certs_db(State), line.
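
As a rough sketch of what that leaves, the new terminate clause without
the handle_trusted_certs_db call could look like the following. The
module name, the cut-down #state{} record, the catch-all clause and the
stubbed notify_senders/1 and notify_renegotiater/1 are stand-ins so that
the snippet compiles on its own; the real definitions are in
ssl_connection.erl. The unused Version binding is also left out here.

```erlang
-module(terminate_sketch).
-export([terminate/3, demo/0]).

%% Hypothetical, cut-down stand-in for the #state{} record in
%% ssl_connection.erl; only the fields used by the new clause are kept.
-record(state, {send_queue = queue:new(),
                renegotiation = undefined}).

%% New clause from the patch, minus the handle_trusted_certs_db(State)
%% call. It performs no socket operations.
terminate({shutdown, transport_closed}, _StateName,
          #state{send_queue = SendQueue, renegotiation = Renegotiate}) ->
    notify_senders(SendQueue),
    notify_renegotiater(Renegotiate);
%% Placeholder for the existing clauses, which are left unchanged.
terminate(_Reason, _StateName, _State) ->
    ok.

%% Stubs only: the real helpers live in ssl_connection.erl.
notify_senders(_SendQueue) -> ok.
notify_renegotiater(_Renegotiate) -> ok.

%% Exercises the new clause with a default state.
demo() ->
    terminate({shutdown, transport_closed}, connection, #state{}).
```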

Regards Ingela Erlang/OTP team


> 
> On 10/24/2012 11:24 AM, Ingela Anderton Andin wrote:
>> Hi!
>>
>> Loïc Hoguin wrote:
>>> This doesn't make a difference so far.
>>
>> This would only make a difference if you do not set active
>> explicitly.
>>
>> Anyway, I have a theory that the inet driver can hang if you
>> call recv on a socket that has been shut down for writing from
>> the other side, or that there is some strange race condition.
>> I have no evidence for this yet; for now I just think it fits the
>> scenario of what seems to happen.
>>
>> I think that ssl should have a new terminate clause that avoids
>> socket operations that are logically not necessary, even if the
>> inet driver hanging can be considered a bug in its own right.
>>
>> So as a first step, try this patch and see if your problem goes away.
>> If it does, that is the solution for you and for the ssl application,
>> and we will then have to try to pinpoint the actual problem in the
>> inet driver and fix that.
>>
>>
>> diff --git a/lib/ssl/src/ssl_connection.erl b/lib/ssl/src/ssl_connection.erl
>> index 1319b54..c9c162b 100644
>> --- a/lib/ssl/src/ssl_connection.erl
>> +++ b/lib/ssl/src/ssl_connection.erl
>> @@ -984,7 +984,7 @@ handle_info({CloseTag, Socket}, StateName,
>>              ok
>>       end,
>>       handle_normal_shutdown(?ALERT_REC(?FATAL, ?CLOSE_NOTIFY), StateName, State),
>> -    {stop, normal, State};
>> +    {stop, {shutdown, transport_closed}, State};
>>
>>   handle_info({ErrorTag, Socket, econnaborted}, StateName,
>>              #state{socket = Socket, start_or_recv_from = StartFrom, role = Role,
>> @@ -1022,6 +1022,14 @@ terminate(_, _, #state{terminated = true}) ->
>>       %% we want to guarantee that Transport:close has been called
>>       %% when ssl:close/1 returns.
>>       ok;
>> +
>> +terminate({shutdown, transport_closed}, _, #state{negotiated_version = Version,
>> +                                                 send_queue = SendQueue,
>> +                                                 renegotiation = Renegotiate} = State) ->
>> +    handle_trusted_certs_db(State),
>> +    notify_senders(SendQueue),
>> +    notify_renegotiater(Renegotiate);
>> +
>>   terminate(Reason, connection, #state{negotiated_version = Version,
>>                                        connection_states = ConnectionStates,
>>                                        transport_cb = Transport,
>>
>>
>> Regards Ingela Erlang/OTP team - Ericsson AB
>>
>>
>>> On 10/17/2012 09:51 AM, Ingela Anderton Andin wrote:
>>>> Hi!
>>>>
>>>> My problem goes away with the following patch
>>>>
>>>> diff --git a/lib/ssl/src/ssl.erl b/lib/ssl/src/ssl.erl
>>>> index 7788f75..771bfa5 100644
>>>> --- a/lib/ssl/src/ssl.erl
>>>> +++ b/lib/ssl/src/ssl.erl
>>>> @@ -869,10 +869,10 @@ internal_inet_values() ->
>>>>
>>>> socket_options(InetValues) ->
>>>>      #socket_options{
>>>> -               mode   = proplists:get_value(mode, InetValues),
>>>> -               header = proplists:get_value(header, InetValues),
>>>> -               active = proplists:get_value(active, InetValues),
>>>> -               packet = proplists:get_value(packet, InetValues),
>>>> +               mode   = proplists:get_value(mode, InetValues, lists),
>>>> +               header = proplists:get_value(header, InetValues, 0),
>>>> +               active = proplists:get_value(active, InetValues, active),
>>>> +               packet = proplists:get_value(packet, InetValues, 0),
>>>>                 packet_size = proplists:get_value(packet_size, InetValues)
>>>>                }.
>>>>
>>>>
>>>> i.e. default values were not properly handled. I know too little
>>>> about your configuration to say whether this is your problem too.
>>>> If not, it would be great if you could give me a way to recreate
>>>> your problem.
>>>>
>>>> Regards Ingela Erlang/OTP team - Ericsson AB
>>>>
>>>>
>>>> Ingela Anderton Andin wrote:
>>>>> Hi!
>>>>>
>>>>> This is puzzling. The links seem to be intact, and the supervisor
>>>>> should have killed the gen_fsm process if it gets stuck in terminate.
>>>>>
>>>>> I tried to recreate your problem and did get a process leak,
>>>>> however it did not manifest itself in quite the same way as yours.
>>>>>
>>>>> In my case I have an active process that seems not to have received
>>>>> the tcp_closed message. The fsm process emulates the active option, as
>>>>> it uses active once to receive TLS packets. If I set the active option
>>>>> the process will terminate. At the moment I have not found the root
>>>>> cause of why it is not working as expected, i.e. whether it is the
>>>>> emulating code that does something wrong or perhaps the inet driver.
>>>>> Will have to keep digging.
>>>>>
>>>>> Regards Ingela Erlang/OTP team - Ericsson AB
>>>>>
>>>>>
>>>>> Loïc Hoguin wrote:
>>>>>> 103> erlang:port_info(Port).
>>>>>> [{name,"tcp_inet"},
>>>>>>  {links,[<0.18199.1670>]},
>>>>>>  {id,51824890},
>>>>>>  {connected,<0.18199.1670>},
>>>>>>  {input,0},
>>>>>>  {output,3583}]
>>>>>> 104> Pid.
>>>>>> <0.18199.1670>
>>>>>>
>>>>>> On 10/16/2012 11:55 AM, Ingela Anderton Andin wrote:
>>>>>>> Hi!
>>>>>>>
>>>>>>> OK, next question: can you do a port_info on the linked port?
>>>>>>>
>>>>>>> Regards Ingela Erlang/OTP Team - Ericsson AB
>>>>>>>
>>>>>>> Loïc Hoguin wrote:
>>>>>>>> Hey,
>>>>>>>>
>>>>>>>> Here's one:
>>>>>>>>
>>>>>>>> [{current_function,{prim_inet,recv0,3}},
>>>>>>>>  {initial_call,{proc_lib,init_p,5}},
>>>>>>>>  {status,waiting},
>>>>>>>>  {message_queue_len,2},
>>>>>>>>  {messages,[{system,{<0.1523.2358>,#Ref<0.0.9161.247946>},get_status},
>>>>>>>>             {system,{<0.19941.2364>,#Ref<0.0.9166.119462>},get_status}]},
>>>>>>>>  {links,[<0.897.0>,#Port<0.51824890>]},
>>>>>>>>  {dictionary,[{ssl_manager,ssl_manager},
>>>>>>>>               {'$ancestors',[ssl_connection_sup,ssl_sup,<0.894.0>]},
>>>>>>>>               {'$initial_call',{ssl_connection,init,1}}]},
>>>>>>>>  {trap_exit,false},
>>>>>>>>  {error_handler,error_handler},
>>>>>>>>  {priority,normal},
>>>>>>>>  {group_leader,<0.893.0>},
>>>>>>>>  {total_heap_size,10946},
>>>>>>>>  {heap_size,4181},
>>>>>>>>  {stack_size,21},
>>>>>>>>  {reductions,8272},
>>>>>>>>  {garbage_collection,[{min_bin_vheap_size,46368},
>>>>>>>>                       {min_heap_size,233},
>>>>>>>>                       {fullsweep_after,10},
>>>>>>>>                       {minor_gcs,1}]},
>>>>>>>>  {suspending,[]}]
>>>>>>>>
>>>>>>>> The two get_status were me trying to inspect and getting a timeout.
>>>>>>>>
>>>>>>>> Will try commenting out the function, that was my guess also. It
>>>>>>>> doesn't explain the other half of the processes though, which still
>>>>>>>> seem to be running happily despite the process owning the socket
>>>>>>>> being dead for days.
>>>>>>>>
>>>>>>>> On 10/16/2012 11:18 AM, Ingela Anderton Andin wrote:
>>>>>>>>> Hi!
>>>>>>>>>
>>>>>>>>> This sounds really strange. It would be interesting to see all
>>>>>>>>> process_info available for the process.
>>>>>>>>>
>>>>>>>>> Something you could try is to comment out the invocation of the
>>>>>>>>> function workaround_transport_delivery_problems in the terminate
>>>>>>>>> function of the ssl_connection process. This function can call
>>>>>>>>> recv(S, 0), which sounds like the probable recv that hangs even
>>>>>>>>> though it should not.
>>>>>>>>>
>>>>>>>>> Regards Ingela Erlang/OTP team - Ericsson AB
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Loïc Hoguin wrote:
>>>>>>>>>> On 10/15/2012 06:09 PM, Attila Rajmund Nohl wrote:
>>>>>>>>>>> 2012/10/15 Loïc Hoguin <>:
>>>>>>>>>>> [...]
>>>>>>>>>>>>> lists:foldl(fun(X, Sum) ->
>>>>>>>>>>>>>     case erlang:process_info(X) of
>>>>>>>>>>>>>         undefined -> Sum;
>>>>>>>>>>>>>         [{current_function, XXX}|_] ->
>>>>>>>>>>>>>             case lists:keyfind(XXX, 1, Sum) of
>>>>>>>>>>>>>                 false -> Curr = 0;
>>>>>>>>>>>>>                 {_, Curr} -> ok
>>>>>>>>>>>>>             end,
>>>>>>>>>>>>>             lists:keystore(XXX, 1, Sum, {XXX, Curr + 1})
>>>>>>>>>>>>>     end
>>>>>>>>>>>>> end, [], List).
>>>>>>>>>>>> [{{prim_inet,recv0,3},25856},{{gen_fsm,loop,7},26574}]
>>>>>>>>>>>>
>>>>>>>>>>>> Not sure which one is the ESTABLISHED list and which one is the
>>>>>>>>>>>> FIN_WAIT2. Of course, I can't use sys:get_status/1 on the PIDs
>>>>>>>>>>>> stuck in prim_inet:recv0/3 because the receive there is quite
>>>>>>>>>>>> specific, so I can't get the stacktrace. The other case doesn't
>>>>>>>>>>>> seem to give anything useful (for my level of knowledge, anyway).
>>>>>>>>>>>
>>>>>>>>>>> You can get the stacktrace with erlang:process_info(Pid,
>>>>>>>>>>> backtrace).
>>>>>>>>>>
>>>>>>>>>> Thanks for the tip!
>>>>>>>>>>
>>>>>>>>>> So yeah, this one is stuck while trying to terminate.
>>>>>>>>>>
>>>>>>>>>> Program counter: 0x00007f05fd6a5608 (prim_inet:recv0/3 + 224)
>>>>>>>>>> CP: 0x0000000000000000 (invalid)
>>>>>>>>>> arity = 0
>>>>>>>>>>
>>>>>>>>>> 0x00007f052e1b1eb0 Return addr 0x00007f05a3248a98 (ssl_connection:terminate/3 + 800)
>>>>>>>>>> y(0)     57928
>>>>>>>>>> y(1)     #Port<0.51824890>
>>>>>>>>>>
>>>>>>>>>> 0x00007f052e1b1ec8 Return addr 0x00007f05a3b29670 (gen_fsm:terminate/7 + 168)
>>>>>>>>>> y(0)     []
>>>>>>>>>> y(1)     []
>>>>>>>>>> y(2)     []
>>>>>>>>>> y(3)     []
>>>>>>>>>> y(4)     #Port<0.51824890>
>>>>>>>>>> y(5)     gen_tcp
>>>>>>>>>>
>>>>>>>>>> 0x00007f052e1b1f00 Return addr 0x00007f05a3bb41d0 (proc_lib:init_p_do_apply/3 + 56)
>>>>>>>>>> y(0)     []
>>>>>>>>>> y(1)
>>>>>>>>>> {state,server,{#Ref<0.0.8553.184512>,<0.18913.1670>},gen_tcp,tcp,tcp_closed,tcp_error,"localhost",8443,#Port<0.51824890>,{ssl_options,[],verify_none,{#Fun<ssl.1.54384637>,[]},false,false,undefined,1,"/home/obfuscated/obfuscatedrlang/lib/obfuscatedunchat-1.6.0/priv/ssl/cert.pem",undefined,"/home/obfuscated/obfuscatedrlang/lib/obfuscatedunchat-1.6.0/priv/ssl/cert.pem",undefined,undefined,undefined,"/home/obfuscated/obfuscatedrlang/lib/obfuscatedunchat-1.6.0/priv/ssl/cacert.pem",undefined,undefined,[<<2 
>>>>>>>>>> bytes>>,<<2 bytes>>,<<2 bytes>>,<<2 bytes>>,<<2 bytes>>,<<2
>>>>>>>>>> bytes>>,<<2 bytes>>,<<2 bytes>>,<<2 bytes>>,<<2 bytes>>,<<2
>>>>>>>>>> bytes>>,<<2 bytes>>,<<2
>>>>>>>>>> bytes>>],#Fun<ssl.0.54384637>,true,268435456,false,[],undefined,false},{socket_options,binary,0,0,0,once},{connection_states,{connection_state,{security_parameters,<<2 
>>>>>>>>>> bytes>>,0,7,1,16,128,16,unknown,2,20,0,<<48 bytes>>,<<32
>>>>>>>>>> bytes>>,<<32
>>>>>>>>>> bytes>>,undefined},undefined,{cipher_state,<<16 bytes>>,<<16
>>>>>>>>>> bytes>>,undefined},<<20 bytes>>,6,true,<<12 bytes>>,<<12
>>>>>>>>>> bytes>>},{connection_state,{security_parameters,undefined,0,undefined,undefined,undefined,undefined,undefined,undefined,undefined,undefined,undefined,undefined,undefined,<<32 
>>>>>>>>>> bytes>>,undefined},undefined,undefined,undefined,undefined,true,undefined,undefined},{connection_state,{security_parameters,<<2 
>>>>>>>>>> bytes>>,0,7,1,16,128,16,unknown,2,20,0,<<48 bytes>>,<<32
>>>>>>>>>> bytes>>,<<32
>>>>>>>>>> bytes>>,undefined},undefined,{cipher_state,<<16 bytes>>,<<16
>>>>>>>>>> bytes>>,undefined},<<20 bytes>>,13,true,<<12 bytes>>,<<12
>>>>>>>>>> bytes>>},{connection_state,{security_parameters,undefined,0,undefined,undefined,undefined,undefined,undefined,undefined,undefined,undefined,undefined,undefined,undefined,<<32 
>>>>>>>>>> bytes>>,undefined},undefined,undefined,undefined,undefined,true,undefined,undefined}},[],<<0 
>>>>>>>>>> bytes>>,<<0 bytes>>,{<<0 bytes>>,<<0
>>>>>>>>>> bytes>>},[],12308,{session,<<32
>>>>>>>>>> bytes>>,undefined,<<1257 bytes>>,0,<<2 bytes>>,<<48
>>>>>>>>>> bytes>>,true,63517514196},24599,ssl_session_cache,{3,1},undefined,false,rsa,undefined,{'RSAPrivateKey','two-prime',26952275589898844250103204854000460899755240864557148991279029405309749386179997816352098133722767607711339559172144166023544257819579600281768301281374192936110073288270279982220728800798557215504319561694659173220853716332492625335748497090542115981135939145850167689577529567336032326581411005966667046488818079418004669621093520594249388264789813277716494693460309931110183497107534349074298533842111855672958994036657571757894555006279600552417098362361837531438833518633912632305124934722467790401548511827982945839067677876435394001531838872958423949934302335970305331259903589271491819721745867851063767101711,65537,728945610618363694299524400013974044353817910988106666265500093396272804476922137563271036849773495653880241216909921636277708729194269519499432595651857072787205775416315535781104441756800626584698817814422133825685692237962420898234801884
>>>>>>>>>> 32467004251893513795
>>>>>>>>>> 29032795180763714863835283712338222389782464852044908949559099608779063302276133782734662944988246892889439189879440128304200546026414115176004328650285319262737051106741309180479178565669060188052206153201268137707738579437817066853295089724557953207910831295502502266942720391639060038564028714207644340973116764361586980768047522941109290140269958017673,175687943987452481712482925881452602286249755596049611520304319723586191396077353211979651292293422727226888628810410185730059847194115171473543563960899974179642474882228165413574970142069071147678024461620581964093179873537217082000749319715442644574561869121959620883833054096268955094522609955548334417459,153409932282113553454184543331491108535630917003675738115171976005561046830080906444372373873384261619709596605477733172587334100633359814892598660495299494503111498422936185897687222105679379604880102069501875829087474529350923531274442007552106843129760936560325773227109973336264866827986047857553670280629,16
>>>>>>>>>> 53778011280640103960
>>>>>>>>>> 32533996867303777118782862747708688208092956158440780252498542811490728487320772469810738971008968489145399289747151126453392805772011338882217952599871103786463875892748648418527046734444999723398379211505851743415571090610985953725319559986129544296274474768880307859595812559810466567,99372239273086723091338362047522171285756194037567212940251777246259022537354389739788150444373539745180764989177727522509078951433348961072685633082784597107680994416138776015512122203187528006871997391618377904030112283443023112892909533616156365176077807633229316646127723088806569214057154029767435353361,7323400945433254897172443884831966307350890604864415567101495881249115607826271555845873590083141172347084964771961315371798965570031721570889013199060717814694355367862624287469661994309269724835502396215672690851146331180228281917872372949595188776597872289445053324786304898777105863585382475713592590505,asn1_NOVALUE},{'DHParameter',179769313486231590770839156793787453197860296
>>>>>>>>>> 04875601170644442368
>>>>>>>>>> 4197180216158519368947833795864925541502180565485980503646440548199239100050792877003355816639229553136239076508735759914822574862575007425302077447712589550957937778424442426617334727629299387668709205606050270810842907692932019128194467627007,2,asn1_NOVALUE},undefined,undefined,#Ref<0.0.0.13264>,{<0.1092.0>,#Ref<0.0.8553.179746>},0,<<0 
>>>>>>>>>> bytes>>,true,undefined,undefined,{[],[]},false,true}
>>>>>>>>>> y(2)     connection
>>>>>>>>>> y(3)     ssl_connection
>>>>>>>>>> y(4) {'DOWN',#Ref<0.0.8553.184512>,process,<0.18913.1670>,normal}
>>>>>>>>>> y(5)     <0.18199.1670>
>>>>>>>>>> y(6)     normal
>>>>>>>>>> y(7)     Catch 0x00007f05a3b29670 (gen_fsm:terminate/7 + 168)
>>>>>>>>>>
>>>>>>>>>> 0x00007f052e1b1f48 Return addr 0x0000000000883498 (<terminate process normally>)
>>>>>>>>>> y(0)     Catch 0x00007f05a3bb41f0 (proc_lib:init_p_do_apply/3 + 88)
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> erlang-bugs mailing list
>>>>> 
>>>>> http://erlang.org/mailman/listinfo/erlang-bugs
>>>>>
>>>>
>>>>
>>>
>>>
>>
> 
> 
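
On the default-value patch to ssl.erl quoted above: the difference
between the two forms of proplists:get_value can be checked in
isolation. A minimal sketch, in which the module name and the sample
option list standing in for InetValues are made up for illustration:

```erlang
-module(defaults_sketch).
-export([demo/0]).

demo() ->
    %% Hypothetical option list: packet_size is set, packet is not.
    InetValues = [{packet_size, 0}],
    %% Two-argument form: a missing key yields the atom undefined.
    undefined = proplists:get_value(packet, InetValues),
    %% Three-argument form: a missing key yields the supplied default.
    0 = proplists:get_value(packet, InetValues, 0),
    %% A key that is present is returned unchanged.
    0 = proplists:get_value(packet_size, InetValues),
    ok.
```

With the two-argument form, any option the caller did not pass ends up
as undefined in the #socket_options{} record, which is the case the
patch addresses.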