[erlang-questions] discussion: mnesia table-specific options

Ulf Wiger <>
Tue Dec 21 20:54:23 CET 2010


Hi Matt,

I have an idea to implement a "transaction proxy" behavior in
mnesia, which could be a way to integrate external DBMSs
with transaction semantics; also perhaps a form of geographical
redundancy protocol. I've also thought that it might be a way to 
implement a caching table.

But it's possible that the access backend behaviour could be 
sufficient to build a cache on top of a disk-based table.
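
As a rough illustration of such a cache (a sketch only, not mnesia code):
the module below keeps recently used objects in an ETS table in front of a
slower store. When the cache is full it drops about 10% of the entries at
random, as described in the quoted message below. The module name
rand_cache, its function names, and the Fetch/Store/Erase funs standing in
for the disc-based table are all assumptions made for the example.

```erlang
-module(rand_cache).
-export([new/2, read/3, write/4, delete/3, count/1]).

%% Hypothetical cache in front of a slower store. Nothing here is part
%% of mnesia; the backing store is reached only through caller-supplied funs.

%% Create a cache holding at most MaxSize objects.
new(Name, MaxSize) when is_atom(Name), is_integer(MaxSize), MaxSize > 0 ->
    Tab = ets:new(Name, [set, public]),
    {Tab, MaxSize}.

%% Read through the cache. Fetch(Key) must return {ok, Value} | not_found.
read({Tab, _Max} = Cache, Key, Fetch) ->
    case ets:lookup(Tab, Key) of
        [{Key, Cached}] ->
            {ok, Cached};
        [] ->
            case Fetch(Key) of
                {ok, Fetched} ->
                    ok = insert(Cache, Key, Fetched),
                    {ok, Fetched};
                not_found ->
                    not_found
            end
    end.

%% Write to the backing store first, then to the cache.
%% Store(Key, Value) must return ok.
write(Cache, Key, Value, Store) ->
    ok = Store(Key, Value),
    insert(Cache, Key, Value).

%% Delete from both the backing store and the cache.
%% Erase(Key) must return ok.
delete({Tab, _Max}, Key, Erase) ->
    ok = Erase(Key),
    true = ets:delete(Tab, Key),
    ok.

%% Number of objects currently cached.
count({Tab, _Max}) ->
    ets:info(Tab, size).

%% Insert into the cache, evicting first if a new key would overflow it.
insert({Tab, Max}, Key, Value) ->
    Full = ets:info(Tab, size) >= Max,
    case Full andalso not ets:member(Tab, Key) of
        true -> evict(Tab, Max);
        false -> ok
    end,
    true = ets:insert(Tab, {Key, Value}),
    ok.

%% Drop about 10% of the capacity (at least one object), chosen at random.
%% tab2list/1 copies the whole cache, which is acceptable only for a sketch.
evict(Tab, Max) ->
    N = max(1, Max div 10),
    Keys = [K || {K, _} <- ets:tab2list(Tab)],
    Tagged = lists:sort([{rand:uniform(), K} || K <- Keys]),
    Victims = [K || {_, K} <- lists:sublist(Tagged, N)],
    lists:foreach(fun(K) -> ets:delete(Tab, K) end, Victims).
```

Random eviction avoids the bookkeeping that LRU would need, at the price of
occasionally throwing away an object that is about to be read again.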

BR,
Ulf W

On 21 Dec 2010, at 20:21, Evans, Matthew wrote:

> While you are on an "mnesia Christmas wish list" Ulf, here is another feature that could be nice ;-)
> 
> I have a number of tables that are quite large, but access to those tables needs to be fast.
> 
> I know I could make a fragmented disc_copies version over multiple nodes to get speed and size, but this isn't always possible in my environment due to hardware limitations.
> 
> What I have done in these cases, and what would be nice to have in mnesia as a standard option, is to make use of the temporal and spatial properties of the data to maintain a small RAM cache of the table, alongside the larger disc-only copy.
> 
> What we have noticed is that when an item is accessed once, it is likely to be accessed again within a short period of time. I then cache that object in a local (or distributed) ETS table when I read/write to/from disc. I can even set up associations (sort of external keys) and load up associated data in other tables (knowing that they will also be accessed soon). Of course, I need to ensure deletions and updates occur in both versions of the table.
> 
> Knowing when to flush from the cache can be a problem. LRU is one option, but this can be costly (need to maintain a table of access attempts). So I do a random flush of 10% of the records when the cache is full. I'll delete some "good" data, but more often than not I won't. Maybe having this ability baked into ets could help here.
> 
> The other issue is that "select" or "match_object" needs to scan the entire disc version of the table, but one can ensure these operations are used infrequently.
> 
> I'm not sure how useful it would be to the general community, but when hardware is limited it could be of help.
> 
> Matt
> 
> 
> -----Original Message-----
> From:  [mailto:] On Behalf Of Ulf Wiger
> Sent: Monday, December 20, 2010 11:30 AM
> To: Morten Krogh
> Cc: 
> Subject: Re: [erlang-questions] discussion: mnesia table-specific options
> 
> 
> On 20 Dec 2010, at 16:56, Morten Krogh wrote:
> 
>> Hi
>> 
>> It sounds like very good work to make it easy to use all kinds of backends with mnesia.
>> 
>> I don't see that ram_copies, disc_copies, disc_only_copies are at a different level than the backend.
>> 
>> Some backends, as you say, can only operate on disk or in memory. (TC can be used in pure memory mode, I would claim, but that is a digression. TCMAP in tcutil.c).
> 
> You're right, of course, but I maintain that it can be a useful 
> distinction to keep the ram/disc/disc_only types as defined in mnesia.
> 
> The most important distinction between RAM and DISK storage is 
> that RAM-only storage (ram_copies) is not persistent, i.e. does not 
> survive a system restart. It is also expected to be fast, but to me, that's 
> a secondary consideration.
> 
> The special RAM+DISK combo (disc_copies) combines persistency with 
> fast lookup, but is, as a consequence, limited by available RAM.
> In the earliest versions of mnesia, the disk part was handled by dets tables,
> but dets was later replaced by disk_log. The only visible difference to the user
> was that log dumps became much faster.
> 
> DISK-only may (and usually does) employ any form of smart caching, but
> is not expected to be limited by available RAM. Any number of good 
> backends could be used instead of dets here, perfectly transparently to 
> the user.
> 
> Your suggestion (below) was more or less how external_copies worked,
> although the options for the backend were passed as user properties,
> which was something of a kludge.
> 
> BR,
> Ulf
> 
>> That will make the interface a bit strange.
>> 
>> What about this? You specify a list [{node, backend, optional options}]
>> 
>> So if I wanted two nodes node1, node2,
>> 
>> I could write
>> 
>> [{node1, ets}, {node1, tc, options}, {node2, dets}, {node2, mysql, options}]
>> 
>> And a very clear set of requirements could be made for new backends, e.g., they must present a get, put, erase etc. Then people could even create their own
>> backend very easily. A proplist or gb_tree could become a backend. A simple file storage could be a backend.
>> 
>> I could just write
>> 
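>> -module(my_files_module).
>> -export([put/2]).
>> 
>> %% Store Value as a file named Key (Key must be a valid file name).
>> put(Key, Value) ->
>>    file:write_file(Key, term_to_binary(Value)).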
>> 
>> and similarly for get and erase,
>> 
>> and then plug it into mnesia
>> 
>> [{node1, my_files_module, Dir1}, {node2, my_files_module, Dir2}]
>> 
>> and then there would be a transactional way of saving files on two nodes using mnesia.
>> 
>> The choice of memory or disk would then be part of the backend and its options, not a separate level. Actually mnesia might not even understand what the backend is doing.
>> The backend could be a remote database, it could be a disk/ram hybrid.
>> 
>> Furthermore, what one could have was a write_only option for the backend. Then mnesia would only use put and erase for that backend and never issue a get. An append only log file could then be plugged in easily as a backend. It would just implement put as file append of {put, Key, Value}, and erase as an append of {erase, Key, Value}. But you could never query it, except after a crash which would be a special case.
>> 
>> For crash recovery the backends could present an iterator through all values.
>> 
>> Cheers,
>> 
>> Morten.
>> 
>> 
>> 
>> 
>> 
>> On 12/20/10 2:56 PM, Ulf Wiger wrote:
>>> Given that there are now several interesting performance options
>>> for ets, and a 64-bit dets version is (sort of) in the works, it seems
>>> a good time to consider how these things can be reflected in
>>> mnesia table definitions.
>>> 
>>> Some time ago, I introduced an 'external_copies' type in mnesia,
>>> and this was used (with some modifications) by mnesiaex to provide
>>> a TokyoCabinet back-end to mnesia. Thesis projects at Klarna have
>>> played around with CouchDB backends etc.
>>> 
>>> I think that conceptually, it would seem good to keep the
>>> ram_copies, disc_copies and disc_only_copies, regardless of
>>> back-end, since they address higher-level access characteristics
>>> (e.g. TokyoCabinet is, strictly speaking, also disc_only.)
>>> 
>>> A form of behaviour option could then be added that gives
>>> additional options - e.g. tuning parameters to InnoDB, dets, ets,
>>> or whatever back-end is being used.
>>> 
>>> Taking it one step further, it should be possible to specify a
>>> default behaviour for each copy type, and override per-table.
>>> 
>>> Comments?
>>> 
>>> BR,
>>> Ulf W
>>> 
>>> Ulf Wiger, CTO, Erlang Solutions, Ltd.
>>> http://erlang-solutions.com
>>> 
>>> 
>>> 
>>> 
>>> ________________________________________________________________
>>> erlang-questions (at) erlang.org mailing list.
>>> See http://www.erlang.org/faq.html
>>> To unsubscribe; mailto:
>>> 
>> 
>> 
>> 
> 
> Ulf Wiger, CTO, Erlang Solutions, Ltd.
> http://erlang-solutions.com
> 
> 
> 
> 
> 
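
For reference, the backend contract proposed in the quoted message above
(get, put, erase, plus an iterator for crash recovery) could look roughly
like the following when backed by a gb_tree. This is only a sketch: the
module name gb_backend, the state-passing style, and the exact function
signatures are assumptions for the example and do not correspond to any
existing mnesia interface.

```erlang
-module(gb_backend).
-export([new/0, put/3, get/2, erase/2, fold/3]).

%% Hypothetical key-value backend kept in a gb_tree. The state is passed
%% explicitly, so every updating call returns the new state.

%% Create an empty backend state.
new() ->
    gb_trees:empty().

%% Insert or overwrite Key.
put(Key, Value, Tree) ->
    gb_trees:enter(Key, Value, Tree).

%% Look up Key. Returns {ok, Value} | not_found.
get(Key, Tree) ->
    case gb_trees:lookup(Key, Tree) of
        {value, Value} -> {ok, Value};
        none -> not_found
    end.

%% Remove Key if present; a missing key is not an error.
erase(Key, Tree) ->
    gb_trees:delete_any(Key, Tree).

%% Visit every object in key order, e.g. to rebuild state after a crash.
%% Fun is called as Fun(Key, Value, Acc) and returns the new accumulator.
fold(Fun, Acc0, Tree) ->
    fold_iter(Fun, Acc0, gb_trees:next(gb_trees:iterator(Tree))).

fold_iter(_Fun, Acc, none) ->
    Acc;
fold_iter(Fun, Acc, {Key, Value, Iter}) ->
    fold_iter(Fun, Fun(Key, Value, Acc), gb_trees:next(Iter)).
```

A write-only backend, as suggested in the same message, would implement
only put and erase from this set.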

Ulf Wiger, CTO, Erlang Solutions, Ltd.
http://erlang-solutions.com




