mnesia, large datasets and key range lookups

Ulf Wiger ulf.wiger@REDACTED
Thu Jul 15 10:15:29 CEST 2004


On Tue, 13 Jul 2004 02:58:11 -0400, Shawn Pearce <spearce@REDACTED> 
wrote:

> This is where I wish mnesia had compound key indexes.  My suggestion
> is pretty simple.  Add another attribute for the key and use that:
>
>   -record(time_parameter, {key, time, parameter, value}).
>
>   create_tables() ->
>      mnesia:create_table(time_parameter,[{type, set},
>                          {disc_only_copies, [node()]},
>                          {attributes, record_info(fields, 
> time_parameter)}]).
>
>   make_key(#time_parameter{time=T, parameter=P}) ->
>      {T, P}.
>
>   add_sample(V) ->
>      V2 = V#time_parameter{ key = make_key(V) },
>      mensia:write(V2).
>
> Then match becomes:
>
>   dirty_para_request(Parameter, StartTime, EndTime) ->
>       Sel = [{#time_parameter{parameter=Parameter,
>                               key='$1', value='_'},
>               [{'<',{const,{StartTime,Parameter}},'$1'},
>                {'=<','$1',{const,{EndTime,Parameter}}}],
>               ['$_']}],
>       mnesia:dirty_select(time_parameter, Sel).
>
>
> Hopefully this will make mnesia use a key range scan.  But note the
> above is untested.  :)

For mnesia to perform the above search smartly, you'd have to use
an 'ordered_set' table. Then, specifying the first part of the key,
and leaving the rest unspecified works very well.

One problem is that dets doesn't support ordered set, so you'd also
have to change it to a disc_copy table -- this, in it's turn -- might
be problematic given the size of the data set.

One thing you could play around with is to use fragmented tables, with
one fragment per day, and a customized frag_hash algorithm, selecting
fragment based on the date. This will allow you to keep a single-table
view, while being able to narrow your searches based on time.

Check the reference manual for mnesia_frag_hash to find out how to use
the callback behaviour for custom fragmentation policies. It's not
that difficult. There is also some sample code to look at in the
mnesia source (apologies for not just writing down the solution, but
I have some children here calling for my attention.)

http://www.erlang.org/doc/r9c/lib/mnesia-4.1.4/doc/html/mnesia_frag_hash.html

/Uffe
-- 
Ulf Wiger




More information about the erlang-questions mailing list