<html>
<head>
<meta content="text/html; charset=windows-1250"
http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">Hello,<br>
<br>
I think there is a memory resource leak in the
diameter_service module.<br>
<br>
This module is a gen_server whose state contains the field watchdogT
:: ets:tid().<br>
This ETS table holds information about the watchdogs.<br>
<br>
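For reference, the entries in this table are #watchdog{} records of
roughly the following shape (a sketch only, with the field names read off
the ets:tab2list/1 output below; the actual definition lives in
diameter_service.erl):<br>
<br>
%% sketch of a #watchdog entry, field names taken from the dumps below<br>
-record(watchdog,<br>
        {pid,      %% pid of the watchdog process<br>
         type,     %% accept | connect<br>
         ref,      %% transport reference<br>
         options,  %% transport options<br>
         state,    %% initial | okay | down | ...<br>
         started,  %% start timestamp<br>
         peer}).   %% false | pid of the peer (transport) process<br>
<br>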
The Diameter application service configuration is (a start-up sketch
follows the list):<br>
<br>
[{'Origin-Host', HostName},<br>
{'Origin-Realm', Realm},<br>
{'Vendor-Id', ...},<br>
{'Product-Name', ...},<br>
{'Auth-Application-Id', [?DCCA_APP_ID]},<br>
{'Supported-Vendor-Id', [...]},<br>
{application, [{alias, diameterNode},<br>
{dictionary, dictionaryDCCA},<br>
{module, dccaCallback}]},<br>
{<b>restrict_connections, false</b>}]<br>
<br>
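Roughly, the service and the listening transport were brought up along
these lines (a sketch only; SvcOpts stands for the option list above, and
acceptCER/2 in module diameterNode is the capabilities callback seen in
the dumps below):<br>
<br>
ok = diameter:start(),<br>
ok = diameter:start_service(diameterNode, SvcOpts),   %% SvcOpts = the option list above<br>
{ok, _Ref} = diameter:add_transport(diameterNode,<br>
                 {listen, [{transport_module, diameter_tcp},<br>
                           {transport_config, [{reuseaddr, true},<br>
                                               {ip, {0,0,0,0}},<br>
                                               {port, 4068}]},<br>
                           {capabilities_cb, fun diameterNode:acceptCER/2},<br>
                           {watchdog_timer, 30000},<br>
                           {reconnect_timer, 60000}]}).<br>
<br>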
After starting the diameter application and adding the service and
transport, the diameter_service state is:<br>
<br>
> diameter_service:state(diameterNode).<br>
#state{id = {1369,41606,329900},<br>
service_name = diameterNode,<br>
service = #diameter_service{pid = <0.1011.0>,<br>
capabilities =
#diameter_caps{...},<br>
applications =
[#diameter_app{...}]},<br>
watchdogT = 4194395,peerT = 4259932,shared_peers = 4325469,<br>
local_peers = 4391006,monitor = false,<br>
options = [{sequence,{0,32}},<br>
{share_peers,false},<br>
{use_shared_peers,false},<br>
{restrict_connections,false}]}<br>
<br>
and ETS table 4194395 contains one record:<br>
<br>
> ets:tab2list(4194395).<br>
[#watchdog{pid = <0.1013.0>,type = accept,<br>
ref = #Ref<0.0.0.1696>,<br>
options = [{transport_module,diameter_tcp},<br>
{transport_config,[{reuseaddr,true},<br>
{ip,{0,0,0,0}},<br>
{port,4068}]},<br>
{capabilities_cb,[#Fun<diameterNode.acceptCER.2>]},<br>
{watchdog_timer,30000},<br>
{reconnect_timer,60000}],<br>
state = initial,<br>
started = {1369,41606,330086},<br>
peer = false}]<br>
<br>
Next I ran a very simple test using the Seagull simulator. The test
scenario is as follows:<br>
<br>
1. seagull: send CER<br>
2. seagull: recv CEA<br>
3. seagull: send CCR (init)<br>
4. seagull: recv CCA (init)<br>
5. seagull: send CCR (update)<br>
6. seagull: recv CCA (update)<br>
7. seagull: send CCR (terminate)<br>
8. seagull: recv CCA (terminate)<br>
<br>
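As a side note, the watchdog state transitions shown in the dumps below
can also be observed without reading the ETS table, by subscribing to the
diameter service events; a minimal sketch (the module name wd_trace is
just an example):<br>
<br>
-module(wd_trace).<br>
<br>
-include_lib("diameter/include/diameter.hrl").<br>
<br>
-export([start/1]).<br>
<br>
%% Print watchdog state changes reported as diameter service events.<br>
start(SvcName) -><br>
    spawn(fun() -> diameter:subscribe(SvcName), loop() end).<br>
<br>
loop() -><br>
    receive<br>
        #diameter_event{info = {watchdog, Ref, PeerRef, {From, To}, _Cfg}} -><br>
            io:format("watchdog ~p ~p: ~p -> ~p~n", [Ref, PeerRef, From, To]),<br>
            loop();<br>
        #diameter_event{} -><br>
            loop()    %% ignore other service events<br>
    end.<br>
<br>
Calling wd_trace:start(diameterNode) before the test should print
transitions such as initial -> okay and okay -> down as the scenario
runs.<br>
<br>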
During the test there are two watchdogs in the ETS table:<br>
<br>
> ets:tab2list(4194395).<br>
[#watchdog{pid = <0.1816.0>,type = accept,<br>
ref = #Ref<0.0.0.1696>,<br>
options = [{transport_module,diameter_tcp},<br>
{transport_config,[{reuseaddr,true},<br>
{ip,{0,0,0,0}},<br>
{port,4068}]},<br>
{capabilities_cb,[#Fun<diameterNode.acceptCER.2>]},<br>
{watchdog_timer,30000},<br>
{reconnect_timer,60000}],<br>
<b>state = initial</b>,<br>
started = {1369,41823,711370},<br>
peer = false},<br>
#watchdog{pid = <0.1013.0>,type = accept,<br>
ref = #Ref<0.0.0.1696>,<br>
options = [{transport_module,diameter_tcp},<br>
{transport_config,[{reuseaddr,true},<br>
{ip,{0,0,0,0}},<br>
{port,4068}]},<br>
{capabilities_cb,[#Fun<diameterNode.acceptCER.2>]},<br>
{watchdog_timer,30000},<br>
{reconnect_timer,60000}],<br>
<b>state = okay</b>,<br>
started = {1369,41606,330086},<br>
peer = <0.1014.0>}]<br>
<br>
After the test, but before the Tw timer has elapsed, there are still two
watchdogs, which is expected:<br>
<br>
> ets:tab2list(4194395).<br>
[#watchdog{pid = <0.1816.0>,type = accept,<br>
ref = #Ref<0.0.0.1696>,<br>
options = [{transport_module,diameter_tcp},<br>
{transport_config,[{reuseaddr,true},<br>
{ip,{0,0,0,0}},<br>
{port,4068}]},<br>
{capabilities_cb,[#Fun<diameterNode.acceptCER.2>]},<br>
{watchdog_timer,30000},<br>
{reconnect_timer,60000}],<br>
<b> state = initial</b>,<br>
started = {1369,41823,711370},<br>
peer = false},<br>
#watchdog{pid = <b><0.1013.0></b>,type = accept,<br>
ref = #Ref<0.0.0.1696>,<br>
options = [{transport_module,diameter_tcp},<br>
{transport_config,[{reuseaddr,true},<br>
{ip,{0,0,0,0}},<br>
{port,4068}]},<br>
{capabilities_cb,[#Fun<diameterNode.acceptCER.2>]},<br>
{watchdog_timer,30000},<br>
{reconnect_timer,60000}],<br>
<b>state = down</b>,<br>
started = {1369,41606,330086},<br>
peer = <b><0.1014.0></b>}]<br>
<br>
But when the Tw timer has elapsed, the transport and watchdog processes
have terminated:<br>
<br>
> erlang:is_process_alive(list_to_pid("<b><0.1014.0></b>")).<br>
false<br>
> erlang:is_process_alive(list_to_pid("<b><0.1013.0></b>")).<br>
false<br>
<br>
and there are still two watchdogs in the ETS table:<br>
<br>
> ets:tab2list(4194395).<br>
[#watchdog{pid = <0.1816.0>,type = accept,<br>
ref = #Ref<0.0.0.1696>,<br>
options = [{transport_module,diameter_tcp},<br>
{transport_config,[{reuseaddr,true},<br>
{ip,{0,0,0,0}},<br>
{port,4068}]},<br>
{capabilities_cb,[#Fun<diameterNode.acceptCER.2>]},<br>
{watchdog_timer,30000},<br>
{reconnect_timer,60000}],<br>
state = initial,<br>
started = {1369,41823,711370},<br>
peer = false},<br>
#watchdog{pid = <0.1013.0>,type = accept,<br>
ref = #Ref<0.0.0.1696>,<br>
options = [{transport_module,diameter_tcp},<br>
{transport_config,[{reuseaddr,true},<br>
{ip,{0,0,0,0}},<br>
{port,4068}]},<br>
{capabilities_cb,[#Fun<diameterNode.acceptCER.2>]},<br>
{watchdog_timer,30000},<br>
{reconnect_timer,60000}],<br>
state = down,<br>
started = {1369,41606,330086},<br>
peer = <0.1014.0>}]<br>
<br>
I think the entry for watchdog <b><0.1013.0></b> should be removed when
the watchdog process terminates.<br>
<br>
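Until that happens, the number of leaked entries can be counted with a
quick check along these lines (a shell sketch; 4194395 is the watchdogT
table id from the dumps above, and the watchdog pid is the first record
field, i.e. element 2 of the stored tuple):<br>
<br>
WatchdogT = 4194395,<br>
Stale = [W || W <- ets:tab2list(WatchdogT),<br>
              not erlang:is_process_alive(element(2, W))],<br>
length(Stale).<br>
<br>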
I ran the test again and now there are 3 watchdogs in the ETS table, so
every run seems to leave one more stale entry behind:<br>
<br>
> ets:tab2list(4194395).<br>
[#watchdog{pid = <b><0.1816.0></b>,type = accept,<br>
ref = #Ref<0.0.0.1696>,<br>
options = [{transport_module,diameter_tcp},<br>
{transport_config,[{reuseaddr,true},<br>
{ip,{0,0,0,0}},<br>
{port,4068}]},<br>
{capabilities_cb,[#Fun<diameterNode.acceptCER.2>]},<br>
{watchdog_timer,30000},<br>
{reconnect_timer,60000}],<br>
<b>state = down</b>,<br>
started = {1369,41823,711370},<br>
peer = <b><0.1817.0></b>},<br>
#watchdog{pid = <0.1013.0>,type = accept,<br>
ref = #Ref<0.0.0.1696>,<br>
options = [{transport_module,diameter_tcp},<br>
{transport_config,[{reuseaddr,true},<br>
{ip,{0,0,0,0}},<br>
{port,4068}]},<br>
{capabilities_cb,[#Fun<diameterNode.acceptCER.2>]},<br>
{watchdog_timer,30000},<br>
{reconnect_timer,60000}],<br>
<b>state = down</b>,<br>
started = {1369,41606,330086},<br>
peer = <0.1014.0>},<br>
#watchdog{pid = <0.3533.0>,type = accept,<br>
ref = #Ref<0.0.0.1696>,<br>
options = [{transport_module,diameter_tcp},<br>
{transport_config,[{reuseaddr,true},<br>
{ip,{0,0,0,0}},<br>
{port,4068}]},<br>
{capabilities_cb,[#Fun<diameterNode.acceptCER.2>]},<br>
{watchdog_timer,30000},<br>
{reconnect_timer,60000}],<br>
<b>state = initial</b>,<br>
started = {1369,42342,845898},<br>
peer = false}]<br>
<br>
The watchdog and transport processes are not alive:<br>
<br>
> erlang:is_process_alive(list_to_pid("<0.1816.0>")).<br>
false<br>
> erlang:is_process_alive(list_to_pid("<0.1817.0>")).<br>
false<br>
<br>
<br>
I suggest the following change to the code to correct this problem (file
diameter_service.erl):<br>
<br>
$ diff diameter_service.erl diameter_service.erl_ok<br>
1006c1006<br>
< connection_down(#watchdog{state = WS,<br>
---<br>
> connection_down(#watchdog{state = ?WD_OKAY,<br>
1015,1017c1015,1021<br>
< ?WD_OKAY == WS<br>
< andalso<br>
< connection_down(Wd, fetch(PeerT, TPid), S).<br>
---<br>
> connection_down(Wd, fetch(PeerT, TPid), S);<br>
><br>
> connection_down(#watchdog{},<br>
> To,<br>
> #state{})<br>
> when is_atom(To) -><br>
> ok.<br>
<br>
You can find this solution in the attachment.<br>
<br>
Regards<br>
Aleksander Nycz<br>
<br>
</div>
<br>
<pre class="moz-signature" cols="72">--
Aleksander Nycz
Senior Software Engineer
Telco_021 BSS R&D
Comarch SA
Phone: +48 12 646 1216
Mobile: +48 691 464 275
website: <a class="moz-txt-link-abbreviated" href="http://www.comarch.pl">www.comarch.pl</a></pre>
</body>
</html>