[erlang-questions] discussion: mnesia table-specific options

Evans, Matthew mevans@REDACTED
Tue Dec 21 20:21:35 CET 2010


While you are on an "mnesia Christmas wish list" Ulf, here is another feature that could be nice ;-)

I have a number of tables that are quite large, but access to those tables needs to be fast.

I know I could make a fragmented disc_copies version over multiple nodes to get speed and size, but this isn't always possible in my environment due to hardware limitations.

What I have done in these cases, and what would be nice to have in mnesia as a standard option, is to exploit the temporal and spatial locality of the data: keep a small RAM cache of the table alongside the larger disc-only copy.

What we have noticed is that when an item is accessed once, it is likely to be accessed again within a short period of time. I then cache that object in a local (or distributed) ETS table when I read/write to/from disc. I can even set up associations (sort of external keys) and load up associated data in other tables (knowing that they will also be accessed soon). Of course, I need to ensure deletions and updates occur in both versions of the table.

Knowing when to flush from the cache can be a problem. LRU is one option, but this can be costly (need to maintain a table of access attempts). So I do a random flush of 10% of the records when the cache is full. I'll delete some "good" data, but more often than not I won't. Maybe having this ability baked into ets could help here.
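
Roughly, the scheme looks like the sketch below. This is a simplified illustration, not the production code: the module name mcache, the state map and the function names are all made up for the example, and it uses dirty operations for brevity, so the cache and the mnesia table are not updated atomically. The random flush copies the whole cache with ets:tab2list/1, which is fine for a small cache but not something to do on a large one.

```erlang
-module(mcache).
-export([new/3, read/2, write/2, delete/2]).

%% Hypothetical sketch: an ETS cache in front of a (disc-only) mnesia table.
%% Records are cached as-is, so the key is element 2, as in mnesia.
new(Name, Tab, Max) ->
    Name = ets:new(Name, [set, public, named_table, {keypos, 2}]),
    #{cache => Name, tab => Tab, max => Max}.

%% Read through the cache: try ETS first, fall back to mnesia on a miss.
read(#{cache := C, tab := Tab} = St, Key) ->
    case ets:lookup(C, Key) of
        [Cached] ->
            {ok, Cached};
        [] ->
            case mnesia:dirty_read(Tab, Key) of
                [Stored] -> cache(St, Stored), {ok, Stored};
                [] -> not_found
            end
    end.

%% Write to the table first, then to the cache.
write(#{tab := Tab} = St, Rec) ->
    ok = mnesia:dirty_write(Tab, Rec),
    cache(St, Rec),
    ok.

%% Deletions must hit both copies.
delete(#{cache := C, tab := Tab}, Key) ->
    ok = mnesia:dirty_delete(Tab, Key),
    true = ets:delete(C, Key),
    ok.

%% Insert into the cache, flushing first if it is full.
cache(#{cache := C, max := Max}, Rec) ->
    case ets:info(C, size) >= Max of
        true -> evict_random(C);
        false -> ok
    end,
    ets:insert(C, Rec).

%% Drop roughly 10% of the cached records (at least one), chosen at random.
evict_random(C) ->
    N = max(1, ets:info(C, size) div 10),
    Keys = [element(2, R) || R <- ets:tab2list(C)],
    Shuffled = [K || {_, K} <- lists:sort([{rand:uniform(), K} || K <- Keys])],
    lists:foreach(fun(K) -> ets:delete(C, K) end, lists:sublist(Shuffled, N)).
```

Typical use would be St = mcache:new(item_cache, item, 1000) once at start-up, then mcache:read(St, Key) and mcache:write(St, Rec) wherever the table was accessed directly before.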

The other issue is that "select" and "match_object" need to scan the entire disc version of the table, but one can ensure these operations are used infrequently.

I'm not sure how useful it would be to the general community, but when hardware is limited it could be of help.

Matt


-----Original Message-----
From: erlang-questions@REDACTED [mailto:erlang-questions@REDACTED] On Behalf Of Ulf Wiger
Sent: Monday, December 20, 2010 11:30 AM
To: Morten Krogh
Cc: erlang-questions@REDACTED
Subject: Re: [erlang-questions] discussion: mnesia table-specific options


On 20 Dec 2010, at 16:56, Morten Krogh wrote:

> Hi
> 
> It sounds like very good work to make it easy to use all kinds of backends with mnesia.
> 
> I don't see that ram_copies, disc_copies, disc_only_copies are at a different level than the backend.
> 
> Some backends, as you say, can only operate on disk or in memory. (TC can be used in pure memory mode, I would claim, but that is a digression. TCMAP in tcutil.c).

You're right, of course, but I maintain that it can be a useful 
distinction to keep the ram/disc/disc_only types as defined in mnesia.

The most important distinction between RAM and DISK storage is 
that RAM-only storage (ram_copies) is not persistent, i.e. does not 
survive a system restart. It is also expected to be fast, but to me, that's 
a secondary consideration.

The special RAM+DISK combo (disc_copies) combines persistency with 
fast lookup, but is, as a consequence, limited by available RAM.
In the earliest versions of mnesia, the disk part was handled by dets tables,
but dets was later replaced by disk_log. The only visible difference to the user
was that log dumps became much faster.

DISK-only may (and usually does) employ any form of smart caching, but
is not expected to be limited by available RAM. Any number of good 
backends could be used instead of dets here, perfectly transparently to 
the user.
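
For reference, the copy type is chosen per table in the definition passed to
mnesia:create_table/2. A minimal sketch follows; the module name, table names
and attributes are made up for illustration, and it assumes a disc schema
already exists on the node (mnesia:create_schema/1 run before mnesia:start/0),
since disc_copies and disc_only_copies need one.

```erlang
-module(copy_types).
-export([create/0]).

%% Hypothetical tables, one per storage type. Assumes mnesia is running
%% with a disc-resident schema on this node.
create() ->
    Nodes = [node()],
    %% RAM only: fast, but contents are lost on restart.
    {atomic, ok} = mnesia:create_table(session,
        [{ram_copies, Nodes}, {attributes, [id, data]}]),
    %% RAM + disc: persistent with fast lookup, limited by available RAM.
    {atomic, ok} = mnesia:create_table(account,
        [{disc_copies, Nodes}, {attributes, [id, balance]}]),
    %% Disc only: persistent and not limited by available RAM.
    {atomic, ok} = mnesia:create_table(archive,
        [{disc_only_copies, Nodes}, {attributes, [id, payload]}]),
    ok.
```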

Your suggestion (below) was more or less how external_copies worked,
although the options for the backend were passed as user properties,
which was something of a kludge.

BR,
Ulf

> That will make the interface a bit strange.
> 
> What about this? You specify a list [{node, backend, optional options}]
> 
> So if I wanted two nodes node1, node2,
> 
> I could write
> 
> [{node1, ets}, {node1, tc, options}, {node2, dets}, {node2, mysql, options}]
> 
> And a very clear set of requirements could be made for new backends, e.g. they must present a get, put, erase etc. Then people could even create their own
> backend very easily. A proplist or gb_tree could become a backend. A simple file storage could be a backend.
> 
> I could just write
> 
> -module(my_files_module).
> -export([put/2]).
> 
> %% Key doubles as the file name.
> put(Key, Value) ->
>     file:write_file(Key, term_to_binary(Value)).
> 
> and similarly for get and erase,
> 
> and then plug it into mnesia
> 
> [{node1, my_files_module, Dir1}, {node2, my_files_module, Dir2}]
> 
> and then there would be a transactional way of saving files on two nodes using mnesia.
> 
> The choice of memory or disk would then be part of the backend and its options, not a separate level. Actually mnesia might not even understand what the backend is doing.
> The backend could be a remote database, it could be a disk/ram hybrid.
> 
> Furthermore, what one could have was a write_only option for the backend. Then mnesia would only use put and erase for that backend and never issue a get. An append only log file could then be plugged in easily as a backend. It would just implement put as file append of {put, Key, Value}, and erase as an append of {erase, Key, Value}. But you could never query it, except after a crash which would be a special case.
> 
> For crash recovery the backends could present an iterator through all values.
> 
> Cheers,
> 
> Morten.
> 
> 
> 
> 
> 
> On 12/20/10 2:56 PM, Ulf Wiger wrote:
>> Given that there are now several interesting performance options
>> for ets, and a 64-bit dets version is (sort of) in the works, it seems
>> a good time to consider how these things can be reflected in
>> mnesia table definitions.
>> 
>> Some time ago, I introduced an 'external_copies' type in mnesia,
>> and this was used (with some modifications) by mnesiaex to provide
>> a TokyoCabinet back-end to mnesia. Thesis projects at Klarna have
>> played around with CouchDB backends etc.
>> 
>> I think that conceptually, it would seem good to keep the
>> ram_copies, disc_copies and disc_only_copies, regardless of
>> back-end, since they address higher-level access characteristics
>> (e.g. TokyoCabinet is, strictly speaking, also disc_only.)
>> 
>> A form of behaviour option could then be added that gives
>> additional options - e.g. tuning parameters to InnoDB, dets, ets,
>> or whatever back-end is being used.
>> 
>> Taking it one step further, it should be possible to specify a
>> default behaviour for each copy type, and override per-table.
>> 
>> Comments?
>> 
>> BR,
>> Ulf W
>> 
>> Ulf Wiger, CTO, Erlang Solutions, Ltd.
>> http://erlang-solutions.com
>> 
>> 
>> 
>> 
>> ________________________________________________________________
>> erlang-questions (at) erlang.org mailing list.
>> See http://www.erlang.org/faq.html
>> To unsubscribe; mailto:erlang-questions-unsubscribe@REDACTED
>> 
> 
> 
> 
