[erlang-questions] Mnesia ets usage oddities.

Mon Apr 21 20:40:25 CEST 2014

All,

I'm currently debugging a system that uses mnesia as a data cache layer to
store the state of client processes. Each client process represents a
single TCP connection to the application; there may be thousands of
connections. When a connection is lost, the client process is alerted and
the mnesia cache is updated. Similarly, when a new connection is made, a
client process is spawned and the mnesia cache is updated to reflect the
new connection.

Testing has shown that losing thousands of connections at once can cause
the application to crash due to ets table exhaustion. This makes perfect
sense: thousands of transactions being spawned in short order is likely to
exhaust the pool. What does not make sense is that the transactions spawned
by the client processes never trigger the system_limit error. The error is
always raised by a separate, timer-triggered transaction that just
happens to run while the client process transactions are processing.

Tracing calls to ets:new and printing out length(ets:all()) after each call
revealed that, prior to the crash, our system reports using more ets tables
than it should be allowed to.
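
For reference, a measurement of this kind can be sketched roughly as follows.
This is a minimal sketch and not the exact code from our system: the module
name ets_watch and its function names are made up for illustration, and it
assumes a release where erlang:system_info(ets_limit) is available. The
handler runs in the tracer process, so the count is sampled shortly after
each call and is only approximate under load.

```erlang
-module(ets_watch).
-export([start/0, stop/0, usage/0]).

%% Number of ets tables currently on this node, and the configured limit.
%% Assumption: erlang:system_info(ets_limit) is supported by the release.
usage() ->
    {length(ets:all()), erlang:system_info(ets_limit)}.

%% Trace every call to ets:new/2 and print the table count after each one.
%% The second element of the handler tuple is the handler's initial state;
%% here it simply counts how many ets:new/2 calls have been seen.
start() ->
    Handler = fun({trace, Pid, call, {ets, new, [Name, _Opts]}}, Seen) ->
                      {Count, Limit} = usage(),
                      io:format("~p ets:new(~p): ~p tables, limit ~p~n",
                                [Pid, Name, Count, Limit]),
                      Seen + 1;
                 (_Other, Seen) ->
                      Seen
              end,
    {ok, _Tracer} = dbg:tracer(process, {Handler, 0}),
    {ok, _} = dbg:p(all, [call]),
    {ok, _} = dbg:tp(ets, new, 2, []),
    ok.

%% Stop tracing and clear the trace patterns set by start/0.
stop() ->
    dbg:stop_clear().
```

Calling ets_watch:usage() from the shell returns a {Count, Limit} tuple, which
is how the 1800 versus 1400 discrepancy mentioned below shows up.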

So, a few questions:

1) How does ets determine if it can create a new table? length(ets:all())
returning 1800 when the system limit is 1400 is confusing.

2) Why could it be that only the timer-triggered transaction crashes due to
ets table exhaustion? With the number of terminating connections, I'd expect
to see a number of client processes terminating for this same reason. Log
verification shows all client processes terminated with the expected reason.

Any insights would be most appreciated.

Thanks,

Charles