[erlang-bugs] R18 Unbounded SSL Session ETS Table Growth

Stefan Grundmann sg2342@REDACTED
Wed Sep 2 20:47:34 CEST 2015


hi,

I think the default ssl session cache callback module is unsuitable
for erlang TLS servers that see connections from lots of clients.

In use cases where session reuse is not needed: 

  {session_cb,null_ssl_session_cb}

in ssl app environment and

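  -module(null_ssl_session_cb).
  -behaviour(ssl_session_cache_api).
  -export([delete/2,foldl/3,init/1,lookup/2,
           select_session/2,terminate/1,update/3]).
  %% no table is created; the returned term is just a placeholder
  init(_) -> invalid.
  terminate(_) -> ok.
  %% nothing is ever stored, so every lookup misses
  lookup(_,_) -> undefined.
  update(_,_,_) -> ok.
  delete(_,_) -> ok.
  foldl(_,Acc,_) -> Acc.
  %% no candidate sessions are ever offered for reuse
  select_session(_,_) -> [].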

provides an easy workaround.
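
For a release that is configured through a sys.config file, the
environment entry above might look like the sketch below. This assumes
the compiled null_ssl_session_cb module is on the code path before the
ssl application starts; the file layout is only an illustration.

```erlang
%% sys.config (sketch): make ssl use the no-op session cache callback.
[
 {ssl, [
   {session_cb, null_ssl_session_cb}
 ]}
].
```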

best regards

Stefan Grundmann

On Wed, Sep 02, 2015 at 03:15:33PM +0100, Ben Murphy wrote:
> I've seen in production that the ssl_session_cache ETS table can
> become very large which will start to cause new SSL connections to
> take > 5 seconds to establish. The root cause of this is that multiple
> SSL sessions are stored for a particular SSL connection configuration
> even though only 1 (the most recent) is needed.
> 
> The ETS table is keyed by {Host, Port, SessionID}, but there are a
> bunch of other parameters that need to match for a session to be
> resumed; for example, the client certificate and compression
> algorithm also need to match. So the current code creates a new
> entry in the table for each connection (even if session reuse is not
> enabled!!), and when you create a new connection it iterates
> through all the sessions stored for that {Host, Port} and checks
> that the other parameters match.
> 
> Looking at the code, sessions are only removed from this table when
> their lifetime is reached (configurable, but 24 hours by default) or
> if a FATAL error happens on a connection with that ID.
> 
> In the pathological case where a server supplies a session ID but
> never supports resuming it, the session table grows at the same rate
> as new connections are established. Each new connection scans the
> existing entries, so establishing N connections takes O(N^2) work.
> Also, when {reuse_sessions, false} has been supplied, the session
> should not be added to the table at all: otherwise a new entry is
> added on every connection and is only removed after 24 hours.
> 
> We've witnessed the catastrophic slowdown occur when making
> requests against a server that normally resumes sessions properly. I
> suspect this is because a) it started failing to resume sessions
> because of some failure on their side, or b) its session lifetime was
> considerably less than 24 hours and erlang started to try to resume
> failed sessions and continuously created new sessions because of this.
> I think it is also important to note that while the
> register_unique_session fix would fix the memory leak if it worked,
> in this situation it would cause a new session to be created each
> time and make ssl session caching pointless until the erlang session
> expired. I think it would be preferable to create a new session and
> delete the old one to preserve the uniqueness, but I'm not sure how
> this could be done with ETS without creating a race that would
> generate multiple sessions. The other alternative would be to delete
> sessions that are known not to resume. For example, if you try to
> resume a session and the server no longer knows about it, the client
> can tell, because it has to go through the whole handshake.
> 
> I think this was meant to be fixed by the register_unique_session
> function, but the fix does not work because it assumes the return
> value of select_session is [#session{}] when it is really
> [ [binary(), #session{}] ].
> 
> https://github.com/erlang/otp/blob/maint/lib/ssl/src/ssl_manager.erl#L564
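
The shape mismatch described above can be reproduced with a plain ETS
table. The module below is only a simplified sketch and not the OTP
source: the module name, the one-field session record, the host name
and the match specification are all assumptions made for illustration.
It stores N rows keyed by {Host, Port, SessionID}, selects them with
'$$', and counts how many results match a bare #session{} pattern
versus the [Id, #session{}] pattern.

```erlang
-module(session_shape_sketch).
-export([demo/1]).

%% Hypothetical stand-in for the real #session{} record in ssl.
-record(session, {id}).

demo(N) ->
    Tab = ets:new(sketch_cache, [ordered_set, private]),
    Host = "example.com",
    Port = 443,
    %% One row per connection, keyed by {Host, Port, SessionId}.
    lists:foreach(
      fun(I) ->
              Id = <<I:32>>,
              ets:insert(Tab, {{Host, Port, Id}, #session{id = Id}})
      end,
      lists:seq(1, N)),
    %% '$$' returns all bound variables as a list, so each hit is
    %% [SessionId, Session] rather than a bare Session.
    Hits = ets:select(Tab, [{{{Host, Port, '$1'}, '$2'}, [], ['$$']}]),
    ets:delete(Tab),
    %% Pattern generators silently skip elements that do not match.
    Bare = [S || #session{} = S <- Hits],
    Pairs = [S || [_Id, #session{} = S] <- Hits],
    {length(Bare), length(Pairs)}.
```

Calling session_shape_sketch:demo(3) returns {0,3}: code that expects a
list of bare #session{} records never sees a match, while every row
comes back for the same {Host, Port}, so the number of candidates to
check grows with the table.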
> 
> This is an example of the broken behaviour with reuse_sessions: false
> (should work on R16B02 and R18).
> 
> 1>  application:ensure_all_started(ssl).
> {ok,[crypto,asn1,public_key,ssl]}
> 2> ets:info(element(2, sys:get_state(whereis(ssl_manager)))).
> [{compressed,false},
>  {memory,107},
>  {owner,<0.45.0>},
>  {heir,none},
>  {name,ssl_otp_session_cache},
>  {size,0},
>  {node,nonode@REDACTED},
>  {named_table,false},
>  {type,ordered_set},
>  {keypos,1},
>  {protection,protected}]
> 3> ssl:close(element(2, ssl:connect("google.com", 443,
> [{reuse_sessions, false}]))).
> ok
> 4> ssl:close(element(2, ssl:connect("google.com", 443,
> [{reuse_sessions, false}]))).
> ok
> 5> ssl:close(element(2, ssl:connect("google.com", 443,
> [{reuse_sessions, false}]))).
> ok
> 6> ssl:close(element(2, ssl:connect("google.com", 443,
> [{reuse_sessions, false}]))).
> ok
> 7> ssl:close(element(2, ssl:connect("google.com", 443,
> [{reuse_sessions, false}]))).
> ok
> 8> ssl:close(element(2, ssl:connect("google.com", 443,
> [{reuse_sessions, false}]))).
> ok
> 9> ssl:close(element(2, ssl:connect("google.com", 443,
> [{reuse_sessions, false}]))).
> ok
> 10> ssl:close(element(2, ssl:connect("google.com", 443,
> [{reuse_sessions, false}]))).
> ok
> 11> ssl:close(element(2, ssl:connect("google.com", 443,
> [{reuse_sessions, false}]))).
> ok
> 12> ets:info(element(2, sys:get_state(whereis(ssl_manager)))).
> [{compressed,false},
>  {memory,881},
>  {owner,<0.45.0>},
>  {heir,none},
>  {name,ssl_otp_session_cache},
>  {size,9},
>  {node,nonode@REDACTED},
>  {named_table,false},
>  {type,ordered_set},
>  {keypos,1},
>  {protection,protected}]