mnesia ram cache

Evans, Matthew mevans@REDACTED
Wed Apr 21 23:33:59 CEST 2010


Hi,

I am about to implement an mnesia database for a project in which the database is expected to grow into the millions of records. I will be using fragmented tables for this, so I am confident that the only space limit will be what the disc has available.

Due to the immense size of the database I will have to create the table as disc_only_copies, since we do not have anywhere near enough RAM available for disc_copies/ram_copies. It goes without saying that lookup (and write) times will be severely hit by this limit.

I need to support in excess of 300,000 lookups per second, which I know ETS can easily manage with room to spare.

To implement this I intend to write my own ETS cache on top of mnesia, and subscribe to the mnesia table's events (mnesia:subscribe/1) so that the cache can be updated when inserts and updates occur in the master mnesia database. The cache will be managed by a gen_server that applies LRU or similar rules to purge data and stop it growing too large.
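To make the idea concrete, here is a minimal sketch of such a cache. The module name ram_cache_sketch, the "_ram_cache" table naming and the {Key, Record, Inserted} entry layout are all made up for illustration. It assumes a set table whose key is the second element of the record, and it ignores the race between a miss being filled and a concurrent write event. With fragmented tables, each fragment would presumably need its own subscription, which this sketch does not attempt.

```erlang
-module(ram_cache_sketch).
-behaviour(gen_server).

-export([start_link/1, lookup/2]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

%% Hypothetical naming scheme: the cache for my_table is my_table_ram_cache.
cache_name(Tab) ->
    list_to_atom(atom_to_list(Tab) ++ "_ram_cache").

start_link(Tab) ->
    gen_server:start_link(?MODULE, Tab, []).

%% Read path: try ETS first, fall back to a dirty mnesia read on a miss
%% and remember the result together with the time it was cached.
lookup(Tab, Key) ->
    Cache = cache_name(Tab),
    case ets:lookup(Cache, Key) of
        [{Key, Rec, _Inserted}] ->
            {ok, Rec};
        [] ->
            case mnesia:dirty_read(Tab, Key) of
                [Rec] ->
                    ets:insert(Cache, {Key, Rec, os:timestamp()}),
                    {ok, Rec};
                [] ->
                    not_found
            end
    end.

init(Tab) ->
    %% public, so that lookup/2 can fill misses from the caller's process
    Cache = ets:new(cache_name(Tab), [named_table, public, set]),
    %% simple table events are delivered to this process as messages
    {ok, _Node} = mnesia:subscribe({table, Tab, simple}),
    {ok, {Tab, Cache}}.

%% Keep the cache in step with writes and deletes on the master table.
handle_info({mnesia_table_event, {write, Rec, _ActivityId}}, {_Tab, Cache} = State) ->
    ets:insert(Cache, {element(2, Rec), Rec, os:timestamp()}),
    {noreply, State};
handle_info({mnesia_table_event, {delete, {_T, Key}, _ActivityId}}, {_Tab, Cache} = State) ->
    ets:delete(Cache, Key),
    {noreply, State};
handle_info({mnesia_table_event, {delete_object, Rec, _ActivityId}}, {_Tab, Cache} = State) ->
    ets:delete(Cache, element(2, Rec)),
    {noreply, State};
handle_info(_Other, State) ->
    {noreply, State}.

handle_call(_Request, _From, State) ->
    {reply, ok, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.
```

Lookups go straight to ETS from the calling process, so the gen_server only handles the subscription events and is not a bottleneck on the read path. Purging is left out here.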

This is fine, but it got me wondering whether it would be appropriate for mnesia itself to implement similar features.

Perhaps the addition of a new set of configuration options:

[{ram_cache, NodeList}, {ram_cache_max_size,SizeInBytes}, {ram_cache_max_records, SizeInRecords}, {ram_cache_purge_fun, Fun()}]

Obviously ram_cache will set up the cache on each node in NodeList.

ram_cache_max_size / ram_cache_max_records will be the limits that trigger a purge of the cache, and ram_cache_purge_fun will be a Fun() that provides the rules for that purge.

For example:

fun() -> [ets:delete(my_table_ram_cache, Rec) || {Rec, _Stuff, Inserted} <- ets:tab2list(my_table_ram_cache), timer:now_diff(erlang:now(), Inserted) > 10000000] end.

Obviously in the example above I am assuming the cache is an ETS table whose entries are {Key, Stuff, InsertedTimestamp} tuples; maybe some better abstraction is desirable.
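As a rough sketch of what the purge rules could look like, here are two helpers: one that drops entries by age and one that enforces a record limit. The module and function names are hypothetical. The entries are assumed to be {Key, Stuff, Inserted} tuples with an erlang:now()-style timestamp, as in the example above.

```erlang
-module(ram_cache_purge_sketch).
-export([purge_older_than/2, purge_to_max_records/2]).

%% Delete every entry older than MaxAgeMicros microseconds.
%% Returns the number of entries deleted.
purge_older_than(Cache, MaxAgeMicros) ->
    Now = os:timestamp(),
    Old = [Key || {Key, _Stuff, Inserted} <- ets:tab2list(Cache),
                  timer:now_diff(Now, Inserted) > MaxAgeMicros],
    lists:foreach(fun(Key) -> ets:delete(Cache, Key) end, Old),
    length(Old).

%% Evict the oldest entries until at most MaxRecords remain.
%% Returns the number of entries evicted.
purge_to_max_records(Cache, MaxRecords) ->
    Excess = ets:info(Cache, size) - MaxRecords,
    case Excess > 0 of
        true ->
            %% {MegaSecs, Secs, MicroSecs} tuples sort oldest first
            ByAge = lists:keysort(3, ets:tab2list(Cache)),
            Victims = lists:sublist(ByAge, Excess),
            lists:foreach(fun({Key, _Stuff, _Inserted}) ->
                                  ets:delete(Cache, Key)
                          end, Victims),
            Excess;
        false ->
            0
    end.
```

Both helpers copy the whole table with ets:tab2list/1, which is fine for a sketch but would be expensive on a large cache. A byte limit along the lines of ram_cache_max_size could be checked with ets:info(Cache, memory), bearing in mind that it reports words rather than bytes.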

Is such a feature on the roadmap anywhere?


Regards

Matt

