<div dir="ltr">All,<div><br></div><div>I'm currently debugging a system that uses mnesia as a data cache layer to store the state of client processes. Each client process represents a single TCP connection to the application, and there may be thousands of connections at any time. When a connection is lost, the client process is alerted and the mnesia cache is updated. Similarly, when a new connection is made, a client process is spawned and the mnesia cache is updated to reflect the new connection.</div>
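<div><br></div><div>For context, the per-connection updates look roughly like this (a minimal sketch; the record and table name are placeholders, not our actual schema):</div>

```erlang
%% A minimal sketch of the per-connection cache updates. The record
%% and table name (client) are placeholders, not our actual schema.
-record(client, {conn_id, pid, status}).

%% Called from the client process when its TCP connection comes up.
on_connect(ConnId) ->
    Pid = self(),
    mnesia:transaction(fun() ->
        mnesia:write(#client{conn_id = ConnId, pid = Pid, status = up})
    end).

%% Called when the client process learns its connection was lost.
on_disconnect(ConnId) ->
    mnesia:transaction(fun() ->
        case mnesia:read({client, ConnId}) of
            [C] -> mnesia:write(C#client{status = down});
            []  -> ok
        end
    end).
```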
<div><br></div><div>Testing has shown that losing thousands of connections at once can cause the application to crash due to ets table exhaustion. That much makes sense: thousands of transactions spawned in short order are likely to exhaust the table pool. What does not make sense is that the transactions spawned by the client processes never trigger the system_limit error. It is always triggered by a separate, timer-triggered transaction that happens to run while the client process transactions are in flight.</div>
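<div><br></div><div>As I understand it, each mnesia transaction allocates at least one temporary ets table for its transaction store, so a mass disconnect translates directly into ets pressure. The burst looks roughly like this (illustrative only, with made-up connection ids):</div>

```erlang
%% Illustrative only: the kind of burst a mass disconnect produces.
%% Thousands of client processes each run a transaction at nearly the
%% same time, each consuming (at least) one temporary ets table.
simulate_mass_disconnect(N) ->
    lists:foreach(
      fun(ConnId) ->
          spawn(fun() ->
              mnesia:transaction(fun() ->
                  mnesia:delete({client, ConnId})
              end)
          end)
      end,
      lists:seq(1, N)).
```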
<div><br></div><div>Tracing calls to ets:new and printing out length(ets:all()) after each call revealed that, prior to the crash, our system reports using more ets tables than it should be allowed to.</div><div><br></div><div>
So a few questions:</div><div><br></div><div>1) How does ets determine whether it can create a new table? length(ets:all()) returning 1800 when the system limit is 1400 is confusing.</div><div><br></div><div>2) Why would only the timer-triggered transaction crash due to ets table exhaustion? Given the number of terminating connections, I'd expect to see a number of client processes terminating for the same reason, yet log verification shows that all client processes terminated with the expected reason.</div>
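<div><br></div><div>In case it matters, this is how I've been comparing the table count against the limit. On reasonably recent OTP releases, erlang:system_info(ets_limit) reports the configured maximum (settable via ERL_MAX_ETS_TABLES):</div>

```erlang
%% Compare the live ets table count against the configured maximum.
report_ets_usage() ->
    Count = length(ets:all()),
    Limit = erlang:system_info(ets_limit),
    io:format("ets tables in use: ~p of ~p~n", [Count, Limit]).
```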
<div><br></div><div>Any insights would be most appreciated.</div><div><br></div><div>Thanks,</div><div><br></div><div>Charles</div></div>