locked up system using :ets.match_object

Fri Jan 17 20:19:08 CET 2020

The table is an ordered_set, with concurrent read and write enabled.
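
For reference, a minimal sketch of how such a table might be created. It assumes "concurrent read and write" means the read_concurrency and write_concurrency options; the table name `records` and the `public` access mode are placeholders, not taken from the real system.

```erlang
-module(table_params).
-export([new_table/0]).

%% Creates an ordered_set table with both concurrency options enabled.
%% Assumptions: the name `records`, `named_table`, and `public` access
%% are illustrative only; the real system may differ.
new_table() ->
    ets:new(records, [ordered_set, public, named_table,
                      {read_concurrency, true},
                      {write_concurrency, true}]).
```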
On Friday, January 17, 2020, 01:10:17 p.m. EST, Led <ledest@REDACTED> wrote:

I am having performance trouble in a system that runs a few queries against a small ETS table of around 10,000 records.

With around 500 concurrent processes everything is fine. At 1,500 I start to notice some small degradation, and at around 3,000 the schedulers grind to a halt. System CPU usage in top is around 50%, but Erlang scheduler usage (scheduler:utilization) is 100%, capped out on all 40 threads.

I am guessing the schedulers are all waiting on locks on the ETS table. I thought match_object and ETS were quite optimized these days (I am using OTP 22), so I am wondering whether there are synchronization/locking issues that could be addressed. 3,000 processes each hitting that table maybe 10 times per second on average does not seem like much: that is 30k match_objects per second, with ongoing inserts.

Also, would there be a way to debug or pinpoint whether this is the exact issue? So far I have only done A/B testing, turning off parts of the system: when I turned off the part that does the match_objects on the ETS table, the system ran fine and never locked up at 100% scheduler usage. It's also hard to profile, because the system is so locked up that the profiler barely runs.
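
One way to narrow this down outside the live system is a small stand-alone benchmark that varies only the number of concurrent callers. The sketch below is an assumption-laden illustration, not the real workload: the module name, the `{Key, Group, Payload}` record shape, and the pattern are all hypothetical. It does not prove lock contention by itself; it only shows how the total time for a fixed amount of match_object work changes with the process count.

```erlang
-module(match_bench).
-export([run/2]).

%% run(NumProcs, CallsPerProc) -> elapsed wall-clock time in microseconds.
run(NumProcs, CallsPerProc) ->
    Tab = ets:new(bench, [ordered_set, public,
                          {read_concurrency, true},
                          {write_concurrency, true}]),
    %% Hypothetical record shape: {Key, Group, Payload}.
    ets:insert(Tab, [{K, K rem 100, payload} || K <- lists:seq(1, 10000)]),
    Parent = self(),
    %% The key position is unbound, so every call traverses the whole table.
    Pattern = {'_', 42, '_'},
    {Micros, ok} = timer:tc(fun() ->
        Pids = [spawn_link(fun() ->
                    worker(Tab, Pattern, CallsPerProc),
                    Parent ! {done, self()}
                end) || _ <- lists:seq(1, NumProcs)],
        %% Wait for every worker to report completion.
        lists:foreach(fun(Pid) -> receive {done, Pid} -> ok end end, Pids)
    end),
    ets:delete(Tab),
    Micros.

worker(_Tab, _Pattern, 0) -> ok;
worker(Tab, Pattern, N) ->
    _ = ets:match_object(Tab, Pattern),
    worker(Tab, Pattern, N - 1).
```

Comparing, say, `match_bench:run(500, 10)` against `match_bench:run(3000, 10)` gives a rough idea of how the cost scales. Note that a pattern with an unbound key forces a full traversal on every call, so 30k calls per second over 10,000 records is a considerable amount of work even without any lock contention.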

For now it seems the solution is to rework the architecture and add a second ETS table as a cached view, so that the match_objects can be replaced with key lookups. The cache would be filled by a single process that pulls from the main table via match_object.
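
A rough sketch of that cached-view idea, under stated assumptions: the module name `view_cache`, the `{ViewKey, MatchPattern}` list, and the fixed refresh interval are all hypothetical choices made for illustration. A single owner process runs the match_objects and stores each result list under a key; readers only do ets:lookup on the cache.

```erlang
-module(view_cache).
-export([start/3, lookup/2]).

%% start(Main, Patterns, IntervalMs) -> {ok, Pid, CacheTable}
%% Patterns is a list of {ViewKey, MatchPattern} (hypothetical shape).
start(Main, Patterns, IntervalMs) ->
    Parent = self(),
    Pid = spawn_link(fun() ->
        %% protected: only this process writes, any process may read.
        Tab = ets:new(view_cache, [set, protected, {read_concurrency, true}]),
        refresh(Main, Tab, Patterns),
        Parent ! {cache_ready, self(), Tab},
        loop(Main, Tab, Patterns, IntervalMs)
    end),
    receive {cache_ready, Pid, Cache} -> {ok, Pid, Cache} end.

%% Refresh the cache every IntervalMs until a `stop` message arrives.
loop(Main, Tab, Patterns, IntervalMs) ->
    receive
        stop -> ok
    after IntervalMs ->
        refresh(Main, Tab, Patterns),
        loop(Main, Tab, Patterns, IntervalMs)
    end.

%% Run each match_object once and store its result under ViewKey.
refresh(Main, Tab, Patterns) ->
    lists:foreach(fun({ViewKey, Pattern}) ->
        ets:insert(Tab, {ViewKey, ets:match_object(Main, Pattern)})
    end, Patterns).

%% Readers replace match_object with a plain key lookup.
lookup(Cache, ViewKey) ->
    case ets:lookup(Cache, ViewKey) of
        [{_, Objects}] -> Objects;
        [] -> []
    end.
```

The trade-off is staleness: readers see data that may be up to one refresh interval old, which is only acceptable if the queries tolerate it.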

You didn't specify parameters of your table.

-- Led.  