[erlang-bugs] [erlang-questions] Process/FD leak in SSL R15B01

Loïc Hoguin essen@REDACTED
Wed Oct 24 14:14:35 CEST 2012


Hey,

I would like to try but this patch isn't correct for R15B01. :)

handle_trusted_certs_db doesn't seem to exist.

On 10/24/2012 11:24 AM, Ingela Anderton Andin wrote:
> Hi!
>
> Loïc Hoguin wrote:
>> This doesn't make a difference so far.
>
> This would only make a difference if you do not set active
> explicitly.
>
> Anyway, I have a theory that the inet driver can hang
> if you try to do recv on a socket that has been shut down for
> writing from the other side, or maybe there is some strange race condition.
> I have no evidence that this is so; for now I just think it fits the
> scenario of what seems to happen.
>
> I think that ssl should have a new terminate clause that
> avoids doing socket operations that are logically not necessary, even if
> it can be considered a bug that the inet driver hangs.
>
> So as a first step, try this patch and see if your problem goes away. If it
> does, that is the solution for you and the ssl application, and we will have
> to try to pinpoint the actual problem in the inet driver and fix that.
>
>
> diff --git a/lib/ssl/src/ssl_connection.erl b/lib/ssl/src/ssl_connection.erl
> index 1319b54..c9c162b 100644
> --- a/lib/ssl/src/ssl_connection.erl
> +++ b/lib/ssl/src/ssl_connection.erl
> @@ -984,7 +984,7 @@ handle_info({CloseTag, Socket}, StateName,
>              ok
>       end,
>       handle_normal_shutdown(?ALERT_REC(?FATAL, ?CLOSE_NOTIFY), StateName, State),
> -    {stop, normal, State};
> +    {stop, {shutdown, transport_closed}, State};
>
>   handle_info({ErrorTag, Socket, econnaborted}, StateName,
>              #state{socket = Socket, start_or_recv_from = StartFrom, role = Role,
> @@ -1022,6 +1022,14 @@ terminate(_, _, #state{terminated = true}) ->
>       %% we want to guarantee that Transport:close has been called
>       %% when ssl:close/1 returns.
>       ok;
> +
> +terminate({shutdown, transport_closed}, _, #state{negotiated_version = Version,
> +                                                 send_queue = SendQueue,
> +                                                 renegotiation = Renegotiate} = State) ->
> +    handle_trusted_certs_db(State),
> +    notify_senders(SendQueue),
> +    notify_renegotiater(Renegotiate);
> +
>   terminate(Reason, connection, #state{negotiated_version = Version,
>                                        connection_states = ConnectionStates,
>                                        transport_cb = Transport,
>
>
> Regards Ingela Erlang/OTP team - Ericsson AB
>
>
>> On 10/17/2012 09:51 AM, Ingela Anderton Andin wrote:
>>> Hi!
>>>
>>> My problem goes away with the following patch
>>>
>>> diff --git a/lib/ssl/src/ssl.erl b/lib/ssl/src/ssl.erl
>>> index 7788f75..771bfa5 100644
>>> --- a/lib/ssl/src/ssl.erl
>>> +++ b/lib/ssl/src/ssl.erl
>>> @@ -869,10 +869,10 @@ internal_inet_values() ->
>>>
>>> socket_options(InetValues) ->
>>>      #socket_options{
>>> -               mode   = proplists:get_value(mode, InetValues),
>>> -               header = proplists:get_value(header, InetValues),
>>> -               active = proplists:get_value(active, InetValues),
>>> -               packet = proplists:get_value(packet, InetValues),
>>> +               mode   = proplists:get_value(mode, InetValues, lists),
>>> +               header = proplists:get_value(header, InetValues, 0),
>>> +               active = proplists:get_value(active, InetValues, active),
>>> +               packet = proplists:get_value(packet, InetValues, 0),
>>>                 packet_size = proplists:get_value(packet_size, InetValues)
>>>                }.
>>>
>>>
>>> i.e. default values were not properly handled. I know too little
>>> about your configuration to say if this is your problem too. If not,
>>> it would be great if you could
>>> give me a way to recreate your problem.
>>>
>>> Regards Ingela Erlang/OTP team - Ericsson AB
>>>
>>>
>>> Ingela Anderton Andin wrote:
>>>> Hi!
>>>>
>>>> This is puzzling. Links seem to be intact. And the supervisor should
>>>> have killed the gen_fsm-process if it gets stuck in terminate.
>>>>
>>>> I tried to recreate your problem, I did get a process leak problem,
>>>> however it did not manifest itself in quite the same way as yours.
>>>>
>>>> In my case I have an active process that seems not to have received
>>>> the tcp_closed message. The fsm process emulates the active option, as it
>>>> uses active once to receive TLS packets. If I set the active option
>>>> the process will terminate. At the moment I have not found the root
>>>> cause of why it is not working as expected, i.e. whether it is the
>>>> emulating code that does something wrong or perhaps the inet driver.
>>>> Will have to keep digging.
>>>>
>>>> Regards Ingela Erlang/OTP team - Ericsson AB
>>>>
>>>>
>>>> Loïc Hoguin wrote:
>>>>> 103> erlang:port_info(Port).
>>>>> [{name,"tcp_inet"},
>>>>>  {links,[<0.18199.1670>]},
>>>>>  {id,51824890},
>>>>>  {connected,<0.18199.1670>},
>>>>>  {input,0},
>>>>>  {output,3583}]
>>>>> 104> Pid.
>>>>> <0.18199.1670>
>>>>>
>>>>> On 10/16/2012 11:55 AM, Ingela Anderton Andin wrote:
>>>>>> Hi!
>>>>>>
>>>>>> Ok, next question can you do a port_info on the linked port?
>>>>>>
>>>>>> Regards Ingela Erlang/OTP Team - Ericsson AB
>>>>>>
>>>>>> Loïc Hoguin wrote:
>>>>>>> Hey,
>>>>>>>
>>>>>>> Here's one:
>>>>>>>
>>>>>>> [{current_function,{prim_inet,recv0,3}},
>>>>>>>  {initial_call,{proc_lib,init_p,5}},
>>>>>>>  {status,waiting},
>>>>>>>  {message_queue_len,2},
>>>>>>>  {messages,[{system,{<0.1523.2358>,#Ref<0.0.9161.247946>},get_status},
>>>>>>>             {system,{<0.19941.2364>,#Ref<0.0.9166.119462>},get_status}]},
>>>>>>>  {links,[<0.897.0>,#Port<0.51824890>]},
>>>>>>>  {dictionary,[{ssl_manager,ssl_manager},
>>>>>>>               {'$ancestors',[ssl_connection_sup,ssl_sup,<0.894.0>]},
>>>>>>>               {'$initial_call',{ssl_connection,init,1}}]},
>>>>>>>  {trap_exit,false},
>>>>>>>  {error_handler,error_handler},
>>>>>>>  {priority,normal},
>>>>>>>  {group_leader,<0.893.0>},
>>>>>>>  {total_heap_size,10946},
>>>>>>>  {heap_size,4181},
>>>>>>>  {stack_size,21},
>>>>>>>  {reductions,8272},
>>>>>>>  {garbage_collection,[{min_bin_vheap_size,46368},
>>>>>>>                       {min_heap_size,233},
>>>>>>>                       {fullsweep_after,10},
>>>>>>>                       {minor_gcs,1}]},
>>>>>>>  {suspending,[]}]
>>>>>>>
>>>>>>> The two get_status were me trying to inspect and getting a timeout.
>>>>>>>
>>>>>>> Will try commenting out the function, that was my guess also. Doesn't
>>>>>>> explain the other half of the processes though, which still seem to be
>>>>>>> running happily despite the process owning the socket being dead for
>>>>>>> days.
>>>>>>>
>>>>>>> On 10/16/2012 11:18 AM, Ingela Anderton Andin wrote:
>>>>>>>> Hi!
>>>>>>>>
>>>>>>>> This sounds really strange. It would be interesting to see all
>>>>>>>> process_info available for the process.
>>>>>>>>
>>>>>>>> Something you could try is to comment out the invocation of the function
>>>>>>>> workaround_transport_delivery_problems in the terminate function of the
>>>>>>>> ssl_connection process. This function can call recv(S, 0), which sounds
>>>>>>>> like the probable recv that hangs even though it should not.
>>>>>>>>
>>>>>>>> Regards Ingela Erlang/OTP team - Ericsson AB
>>>>>>>>
>>>>>>>>
>>>>>>>> Loïc Hoguin wrote:
>>>>>>>>> On 10/15/2012 06:09 PM, Attila Rajmund Nohl wrote:
>>>>>>>>>> 2012/10/15 Loïc Hoguin <essen@REDACTED>:
>>>>>>>>>> [...]
>>>>>>>>>>>> lists:foldl(fun(X, Sum) ->
>>>>>>>>>>>>     case erlang:process_info(X) of
>>>>>>>>>>>>         undefined -> Sum;
>>>>>>>>>>>>         [{current_function, XXX}|_] ->
>>>>>>>>>>>>             case lists:keyfind(XXX, 1, Sum) of
>>>>>>>>>>>>                 false -> Curr = 0;
>>>>>>>>>>>>                 {_, Curr} -> ok
>>>>>>>>>>>>             end,
>>>>>>>>>>>>             lists:keystore(XXX, 1, Sum, {XXX, Curr + 1})
>>>>>>>>>>>>     end
>>>>>>>>>>>> end, [], List).
>>>>>>>>>>> [{{prim_inet,recv0,3},25856},{{gen_fsm,loop,7},26574}]
>>>>>>>>>>>
>>>>>>>>>>> Not sure which one is the ESTABLISHED list and which one is the
>>>>>>>>>>> FIN_WAIT2.
>>>>>>>>>>> Of course, I can't use sys:get_status/1 on the PIDs stuck in
>>>>>>>>>>> prim_inet:recv0/3 because the receive there is quite specific. So I
>>>>>>>>>>> can't get the stacktrace. The other case doesn't seem to give anything
>>>>>>>>>>> useful (for my level of knowledge, anyway).
>>>>>>>>>>
>>>>>>>>>> You can get the stacktrace with erlang:process_info(Pid, backtrace).
>>>>>>>>>
>>>>>>>>> Thanks for the tip!
>>>>>>>>>
>>>>>>>>> So yeah, this one is stuck while trying to terminate.
>>>>>>>>>
>>>>>>>>> Program counter: 0x00007f05fd6a5608 (prim_inet:recv0/3 + 224)
>>>>>>>>> CP: 0x0000000000000000 (invalid)
>>>>>>>>> arity = 0
>>>>>>>>>
>>>>>>>>> 0x00007f052e1b1eb0 Return addr 0x00007f05a3248a98 (ssl_connection:terminate/3 + 800)
>>>>>>>>> y(0)     57928
>>>>>>>>> y(1)     #Port<0.51824890>
>>>>>>>>>
>>>>>>>>> 0x00007f052e1b1ec8 Return addr 0x00007f05a3b29670 (gen_fsm:terminate/7 + 168)
>>>>>>>>> y(0)     []
>>>>>>>>> y(1)     []
>>>>>>>>> y(2)     []
>>>>>>>>> y(3)     []
>>>>>>>>> y(4)     #Port<0.51824890>
>>>>>>>>> y(5)     gen_tcp
>>>>>>>>>
>>>>>>>>> 0x00007f052e1b1f00 Return addr 0x00007f05a3bb41d0 (proc_lib:init_p_do_apply/3 + 56)
>>>>>>>>> y(0)     []
>>>>>>>>> y(1)
>>>>>>>>> {state,server,{#Ref<0.0.8553.184512>,<0.18913.1670>},gen_tcp,tcp,tcp_closed,tcp_error,"localhost",8443,#Port<0.51824890>,{ssl_options,[],verify_none,{#Fun<ssl.1.54384637>,[]},false,false,undefined,1,"/home/obfuscated/obfuscatedrlang/lib/obfuscatedunchat-1.6.0/priv/ssl/cert.pem",undefined,"/home/obfuscated/obfuscatedrlang/lib/obfuscatedunchat-1.6.0/priv/ssl/cert.pem",undefined,undefined,undefined,"/home/obfuscated/obfuscatedrlang/lib/obfuscatedunchat-1.6.0/priv/ssl/cacert.pem",undefined,undefined,[<<2
>>>>>>>>> bytes>>,<<2 bytes>>,<<2 bytes>>,<<2 bytes>>,<<2 bytes>>,<<2
>>>>>>>>> bytes>>,<<2 bytes>>,<<2 bytes>>,<<2 bytes>>,<<2 bytes>>,<<2
>>>>>>>>> bytes>>,<<2 bytes>>,<<2
>>>>>>>>> bytes>>],#Fun<ssl.0.54384637>,true,268435456,false,[],undefined,false},{socket_options,binary,0,0,0,once},{connection_states,{connection_state,{security_parameters,<<2
>>>>>>>>> bytes>>,0,7,1,16,128,16,unknown,2,20,0,<<48 bytes>>,<<32
>>>>>>>>> bytes>>,<<32
>>>>>>>>> bytes>>,undefined},undefined,{cipher_state,<<16 bytes>>,<<16
>>>>>>>>> bytes>>,undefined},<<20 bytes>>,6,true,<<12 bytes>>,<<12
>>>>>>>>> bytes>>},{connection_state,{security_parameters,undefined,0,undefined,undefined,undefined,undefined,undefined,undefined,undefined,undefined,undefined,undefined,undefined,<<32
>>>>>>>>> bytes>>,undefined},undefined,undefined,undefined,undefined,true,undefined,undefined},{connection_state,{security_parameters,<<2
>>>>>>>>> bytes>>,0,7,1,16,128,16,unknown,2,20,0,<<48 bytes>>,<<32
>>>>>>>>> bytes>>,<<32
>>>>>>>>> bytes>>,undefined},undefined,{cipher_state,<<16 bytes>>,<<16
>>>>>>>>> bytes>>,undefined},<<20 bytes>>,13,true,<<12 bytes>>,<<12
>>>>>>>>> bytes>>},{connection_state,{security_parameters,undefined,0,undefined,undefined,undefined,undefined,undefined,undefined,undefined,undefined,undefined,undefined,undefined,<<32
>>>>>>>>> bytes>>,undefined},undefined,undefined,undefined,undefined,true,undefined,undefined}},[],<<0
>>>>>>>>> bytes>>,<<0 bytes>>,{<<0 bytes>>,<<0
>>>>>>>>> bytes>>},[],12308,{session,<<32
>>>>>>>>> bytes>>,undefined,<<1257 bytes>>,0,<<2 bytes>>,<<48
>>>>>>>>> bytes>>,true,63517514196},24599,ssl_session_cache,{3,1},undefined,false,rsa,undefined,{'RSAPrivateKey','two-prime',26952275589898844250103204854000460899755240864557148991279029405309749386179997816352098133722767607711339559172144166023544257819579600281768301281374192936110073288270279982220728800798557215504319561694659173220853716332492625335748497090542115981135939145850167689577529567336032326581411005966667046488818079418004669621093520594249388264789813277716494693460309931110183497107534349074298533842111855672958994036657571757894555006279600552417098362361837531438833518633912632305124934722467790401548511827982945839067677876435394001531838872958423949934302335970305331259903589271491819721745867851063767101711,65537,72894561061836369429952440001397404435381791098810666626550009339627280447692213756327103684977349565388024121690992163627770872919426951949943259565185707278720577541631553578110444175680062658469881781442213382568569223796242089823480188432
>>>>>>>>> 467004251893513795
>>>>>>>>> 29032795180763714863835283712338222389782464852044908949559099608779063302276133782734662944988246892889439189879440128304200546026414115176004328650285319262737051106741309180479178565669060188052206153201268137707738579437817066853295089724557953207910831295502502266942720391639060038564028714207644340973116764361586980768047522941109290140269958017673,175687943987452481712482925881452602286249755596049611520304319723586191396077353211979651292293422727226888628810410185730059847194115171473543563960899974179642474882228165413574970142069071147678024461620581964093179873537217082000749319715442644574561869121959620883833054096268955094522609955548334417459,153409932282113553454184543331491108535630917003675738115171976005561046830080906444372373873384261619709596605477733172587334100633359814892598660495299494503111498422936185897687222105679379604880102069501875829087474529350923531274442007552106843129760936560325773227109973336264866827986047857553670280629,1653
>>>>>>>>> 778011280640103960
>>>>>>>>> 32533996867303777118782862747708688208092956158440780252498542811490728487320772469810738971008968489145399289747151126453392805772011338882217952599871103786463875892748648418527046734444999723398379211505851743415571090610985953725319559986129544296274474768880307859595812559810466567,99372239273086723091338362047522171285756194037567212940251777246259022537354389739788150444373539745180764989177727522509078951433348961072685633082784597107680994416138776015512122203187528006871997391618377904030112283443023112892909533616156365176077807633229316646127723088806569214057154029767435353361,7323400945433254897172443884831966307350890604864415567101495881249115607826271555845873590083141172347084964771961315371798965570031721570889013199060717814694355367862624287469661994309269724835502396215672690851146331180228281917872372949595188776597872289445053324786304898777105863585382475713592590505,asn1_NOVALUE},{'DHParameter',17976931348623159077083915679378745319786029604
>>>>>>>>> 875601170644442368
>>>>>>>>> 4197180216158519368947833795864925541502180565485980503646440548199239100050792877003355816639229553136239076508735759914822574862575007425302077447712589550957937778424442426617334727629299387668709205606050270810842907692932019128194467627007,2,asn1_NOVALUE},undefined,undefined,#Ref<0.0.0.13264>,{<0.1092.0>,#Ref<0.0.8553.179746>},0,<<0
>>>>>>>>> bytes>>,true,undefined,undefined,{[],[]},false,true}
>>>>>>>>> y(2)     connection
>>>>>>>>> y(3)     ssl_connection
>>>>>>>>> y(4) {'DOWN',#Ref<0.0.8553.184512>,process,<0.18913.1670>,normal}
>>>>>>>>> y(5)     <0.18199.1670>
>>>>>>>>> y(6)     normal
>>>>>>>>> y(7)     Catch 0x00007f05a3b29670 (gen_fsm:terminate/7 + 168)
>>>>>>>>>
>>>>>>>>> 0x00007f052e1b1f48 Return addr 0x0000000000883498 (<terminate process normally>)
>>>>>>>>> y(0)     Catch 0x00007f05a3bb41f0 (proc_lib:init_p_do_apply/3 + 88)
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>> _______________________________________________
>>>> erlang-bugs mailing list
>>>> erlang-bugs@REDACTED
>>>> http://erlang.org/mailman/listinfo/erlang-bugs
>>>>
>>>
>>>
>>
>>
>


-- 
Loïc Hoguin
Erlang Cowboy
Nine Nines
http://ninenines.eu


