mnesia power index
Ulf Wiger (AL/EAB)
ulf.wiger@REDACTED
Thu Feb 17 19:06:22 CET 2005
I thought I'd share a small hack I made to mnesia-4.2.
The purpose of the hack was to make room for more
flexible index functionality.
You can now create an index in the following manner:
mnesia:add_table_index(Tab, Pos :: integer() | atom());
mnesia:add_table_index(Tab, {{Pos,Tag}, M, F, Arg, IsOrdered}).
  Pos       : integer() | atom()   The attribute position or name,
                                   as with old-style indexes
  Tag       : atom()               A user-friendly tag; {Pos,Tag}
                                   uniquely identifies the index
  M         : atom()               Module name
  F         : atom()               Function name
  Arg       : term()               Extra argument
  IsOrdered : bool()               true means an ordered index
With the new syntax, you can have several indexes on a given
attribute. A special case is if Pos == 1. Then the index works
on the whole object.
The index callback function is called like this:
M:F(Data, Arg) -> [IndexValue].
Data is either the value of the given attribute, or the whole
object, if Pos == 1. Multiple index values can be created for
each object -- e.g. when breaking up a string into whole words.
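
To make the callback contract above concrete, here is a minimal sketch of how index entries could be derived from one object. It is not code from the patch: the module name ix_sketch, the helper index_entries/3, and both example callbacks are hypothetical, Pos is assumed to be an integer position, and string:lowercase/1 assumes OTP 20 or later.

```erlang
-module(ix_sketch).
-export([index_entries/3, words/2, full_name/2]).

%% Hypothetical helper, not part of the patch: compute the
%% {IndexValue, ObjKey} pairs that one object would contribute.
%% Assumes Pos is an integer record position (1 = whole object).
index_entries(Obj, Pos, {M, F, Arg}) when is_tuple(Obj), is_integer(Pos) ->
    Data = case Pos of
               1 -> Obj;                 % special case: whole object
               _ -> element(Pos, Obj)    % value of a single attribute
           end,
    ObjKey = element(2, Obj),            % mnesia keeps the key in position 2
    [{IxValue, ObjKey} || IxValue <- M:F(Data, Arg)].

%% Example callback: one index value per whitespace-separated word.
words(Str, _Arg) ->
    string:tokens(Str, " \t\n").

%% Example callback for Pos == 1: a single compound, lower-cased value.
full_name(Obj, {FirstPos, LastPos}) ->
    [{string:lowercase(element(LastPos, Obj)),
      string:lowercase(element(FirstPos, Obj))}].
```

Calling ix_sketch:index_entries({test,"uffe",ref1,"uffes words"}, 4, {ix_sketch,words,[]}) yields two entries, one per word, both pointing at the key "uffe". With Pos == 1 and full_name/2 the callback sees the whole object and returns a single compound value.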
The old way works as before. You can also create both old and
new indexes via the create_table/2 function.
I've attached some modified mnesia files. To compile them,
make sure the compiler can locate mnesia.hrl (it's in
mnesia-4.2/src/). The attachment is 43K -- I hope the list
server doesn't think that's too much.
Two limitations: functions like index_match_object() work only on
old-style indexes, and you cannot have an ordered index on a
disc_only_copies table, as dets doesn't support the ordered_set type.
I've added the following functions. In the following,
Index :: Pos | {Pos, Tag}; Pos :: atom() | integer():
- dirty_index_foldl(Fun,Acc,Tab,Index)
- dirty_index_foldr(Fun,Acc,Tab,Index)
- dirty_index_first(Tab, Index)
- dirty_index_next(Tab, Index, Key)
- dirty_index_prev(Tab, Index, Key)
- dirty_index_last(Tab, Index)
The first/next/prev/last functions return {IndexValue, Objects}.
The fold[lr] functions call Fun({IndexValue, ObjKey, [Object]}, Acc)
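
To illustrate why an ordered index makes first/next and fold traversal meaningful, here is a small stand-alone sketch using a plain ets ordered_set keyed on {IndexValue, ObjKey}. That key layout is an assumption made for the illustration, not a description of how the patch stores its data, and the module name ordered_ix_sketch is hypothetical.

```erlang
-module(ordered_ix_sketch).
-export([demo/0]).

%% Emulate an ordered index with an ets ordered_set whose keys are
%% {IndexValue, ObjKey} tuples (an assumed layout, for illustration only).
demo() ->
    Ix = ets:new(ix, [ordered_set]),
    true = ets:insert(Ix, [{{"words", uffe}},
                           {{"uffes", uffe}},
                           {{"hanses", hans}},
                           {{"words", hans}}]),
    First = ets:first(Ix),               % smallest key in term order
    Next  = ets:next(Ix, First),         % its successor
    %% foldr visits an ordered_set from last to first, so consing
    %% onto the accumulator gives the keys in ascending order.
    All = ets:foldr(fun({K}, Acc) -> [K | Acc] end, [], Ix),
    true = ets:delete(Ix),
    {First, Next, All}.
```

Here demo/0 returns {"hanses",hans} as the first key and {"uffes",uffe} as the next one, followed by all four keys in ascending order. An unordered (set or bag) table gives no such guarantee about traversal order.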
=:=:=:=:=:=:=:=:=:=:=
So, what can you do with this?
Well, lots. A few obvious uses are:
- index on whole words
- index on word stems (we'll try to demo this soon)
- convert strings to lower case
- index on attribute combinations (compound indexes)
- perhaps even redo the snmp hook so that it
doesn't have to be a special hack, requiring
the primary key to be structured in a special way.
Below are some examples.
Please take it for a spin, and let me know what you think.
Please note that I have no authority to put anything into
mnesia, so if you like this stuff, you can help lobby for it.
Regards,
Uffe
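%%%%% First, a callback module with indexing functions:
-module(test).
-export([words/2, name/2]).
-import(httpd_util, [to_lower/1]).

%% words(Str, Arg) -> [Word]
%% Split a string into whitespace-separated words;
%% Arg == locase also converts each word to lower case.
words(Str, []) ->
    string:tokens(Str, " \t\n");
words(Str, locase) ->
    [to_lower(W) || W <- string:tokens(Str, " \t\n")].

%% name(Obj, [locase, FN, LN]) -> [{LastName, FirstName}]
%% Compound index value built from the attributes at positions
%% FN and LN of the whole object, both in lower case.
name(Obj, [locase,FN,LN]) ->
    FirstName = element(FN, Obj),
    LastName = element(LN, Obj),
    [{to_lower(LastName), to_lower(FirstName)}].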
%%%% Then, some shell interaction:
=PROGRESS REPORT==== 17-Feb-2005::18:49:38 ===
application: mnesia
started_at: nonode@REDACTED
ok
** First, a simple index that splits a string into words:
3> mnesia:create_table(test,[{attributes,[key,ref,data]}]).
{atomic,ok}
4> mnesia:add_table_index(test,{{data,words},test,words,[],true}).
{atomic,ok}
5> mnesia:dirty_write({test,"uffe",ref1,"uffes words"}).
ok
6> mnesia:dirty_write({test,"hans",ref1,"hanses words"}).
ok
7> mnesia:dirty_write({test,"per",ref2,"pers word"}).
ok
8> mnesia:dirty_index_read(test,"words",{data,words}).
[{test,"hans",ref1,"hanses words"},{test,"uffe",ref1,"uffes words"}]
9> mnesia:dirty_index_read(test,"word",{data,words}).
[{test,"per",ref2,"pers word"}]
** Just making sure that old indexes still work:
10> mnesia:add_table_index(test,ref).
{atomic,ok}
11> mnesia:dirty_index_read(test,ref1,ref).
[{test,"hans",ref1,"hanses words"},{test,"uffe",ref1,"uffes words"}]
12> mnesia:dirty_index_read(test,ref2,ref).
[{test,"per",ref2,"pers word"}]
** Verifying that you can also delete indexes:
13> mnesia:del_table_index(test, {data,words}).
{atomic,ok}
14> mnesia:del_table_index(test, ref).
{atomic,ok}
15>
15>
** A bag table. These are tricky because you must filter
** out objects with the same key, but where the index fun
** doesn't produce a matching index value:
15> mnesia:create_table(testbag,[{type,bag},{attributes,[key,ref,data]}]).
{atomic,ok}
16> mnesia:add_table_index(testbag,{{data,words},test,words,[],true}).
{atomic,ok}
17> mnesia:dirty_write({testbag,uffe,ref1,"uffes words"}).
ok
18> mnesia:dirty_write({testbag,hans,ref1,"hanses words"}).
ok
19> mnesia:dirty_write({testbag,uffe,ref2,"pers word"}).
ok
20> mnesia:dirty_index_read(testbag,"words",{data,words}).
[{testbag,hans,ref1,"hanses words"},{testbag,uffe,ref1,"uffes words"}]
21> mnesia:dirty_index_read(testbag,"word",{data,words}).
[{testbag,uffe,ref2,"pers word"}]
22>
22>
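
The filtering step needed for bag tables can be sketched as follows. This is an assumption about the general technique, not the patch's actual code: once the index has pointed at a key, every object stored under that key is re-checked against the callback, and only those that really produce the requested index value are kept. The module name bag_ix_sketch and the helper filter_matches/4 are hypothetical.

```erlang
-module(bag_ix_sketch).
-export([filter_matches/4, words/2]).

%% Hypothetical helper: SameKeyObjs are all objects a bag table holds
%% under one key. Keep only those whose attribute at Pos yields IxValue
%% when passed through the index callback M:F/2.
filter_matches(IxValue, SameKeyObjs, Pos, {M, F, Arg}) ->
    [Obj || Obj <- SameKeyObjs,
            lists:member(IxValue, M:F(element(Pos, Obj), Arg))].

%% Same word-splitting callback as in the examples above.
words(Str, _Arg) ->
    string:tokens(Str, " \t\n").
```

For the two objects stored under the key uffe in the session above, looking up "word" keeps only {testbag,uffe,ref2,"pers word"}, because "uffes words" splits into "uffes" and "words", neither of which equals "word".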
** Another small example, showing how to do case-insensitive
** index lookups, unordered index:
22> mnesia:create_table(test3,[{attributes,[key,data]}]).
{atomic,ok}
23> mnesia:add_table_index(test3,{{data,words},test,words,locase,false}).
{atomic,ok}
24> mnesia:dirty_write({test3,1,"The Quick Brown Fox"}).
ok
25> mnesia:dirty_write({test3,2,"the quick brown fox"}).
ok
26> mnesia:dirty_write({test3,3,"JUMPS OVER THE LAZY DOG"}).
ok
27> mnesia:dirty_write({test3,4,"jumps over the lazy dog"}).
ok
28> mnesia:dirty_index_read(test3,"fox",{data,words}).
[{test3,2,"the quick brown fox"},{test3,1,"The Quick Brown Fox"}]
29> mnesia:dirty_index_read(test3,"the",{data,words}).
[{test3,4,"jumps over the lazy dog"},
{test3,3,"JUMPS OVER THE LAZY DOG"},
{test3,2,"the quick brown fox"},
{test3,1,"The Quick Brown Fox"}]
30> mnesia:dirty_index_read(test3,"dog",{data,words}).
[{test3,4,"jumps over the lazy dog"},{test3,3,"JUMPS OVER THE LAZY DOG"}]
31>
31>
** An example showing a compound case-insensitive, ordered index:
31> mnesia:create_table(test4,[{attributes,[key,firstname,lastname,data]}]).
{atomic,ok}
32> mnesia:add_table_index(test4,{{1,name},test,name,[locase,3,4],true}).
{atomic,ok}
33> mnesia:dirty_write({test4,1,"Ulf","Wiger","The Quick Brown Fox"}).
ok
34> mnesia:dirty_write({test4,2,"Joe", "Armstrong","the quick brown fox"}).
ok
35> mnesia:dirty_write({test4,3,"ulf", "wiger", "JUMPS OVER THE LAZY DOG"}).
ok
36> mnesia:dirty_write({test4,4,"joe", "armstrong", "jumps over the lazy dog"}).
ok
37> mnesia:dirty_index_read(test4,{"wiger","ulf"},{1,name}).
[{test4,1,"Ulf","Wiger","The Quick Brown Fox"},
{test4,3,"ulf","wiger","JUMPS OVER THE LAZY DOG"}]
38> mnesia:dirty_index_read(test4,{"armstrong","joe"},{1,name}).
[{test4,2,"Joe","Armstrong","the quick brown fox"},
{test4,4,"joe","armstrong","jumps over the lazy dog"}]
** Let's try the fold and iterator functions:
39> mnesia:dirty_index_foldr(fun({IdxKey,ObjKey,Objs} =X, Acc) -> [X|Acc] end, [], test4, {1,name}).
[{{"armstrong","joe"},2,[{test4,2,"Joe","Armstrong","the quick brown fox"}]},
{{"armstrong","joe"},
4,
[{test4,4,"joe","armstrong","jumps over the lazy dog"}]},
{{"wiger","ulf"},1,[{test4,1,"Ulf","Wiger","The Quick Brown Fox"}]},
{{"wiger","ulf"},3,[{test4,3,"ulf","wiger","JUMPS OVER THE LAZY DOG"}]}]
** Oops! The following functions don't work right!
40> mnesia:dirty_index_first(test4,{1,name}).
{{{"armstrong","joe"},2},[]}
41> mnesia:dirty_index_next(test4,{1,name},{{"armstrong","joe"},2}).
{{{"armstrong","joe"},4},[]}
** They do work with old-style indexes, and should work with
** unordered indexes. I will fix this.
42> mnesia:add_table_index(test,ref).
{atomic,ok}
43> mnesia:dirty_index_first(test,ref).
{ref1,[{test,"hans",ref1,"hanses words"},{test,"uffe",ref1,"uffes words"}]}
44> mnesia:dirty_index_next(test,ref,ref1).
{ref2,[{test,"per",ref2,"pers word"}]}
45>
<<mnesia_power_index.tgz>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mnesia_power_index.tgz
Type: application/x-compressed
Size: 43811 bytes
Desc: mnesia_power_index.tgz
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20050217/02ff0054/attachment.bin>