mnesia:match_object() on an ordered_set

Ulf Wiger etxuwig@REDACTED
Fri Apr 25 11:41:14 CEST 2003


On Thu, 24 Apr 2003, Hakan Mattsson wrote:

>On Thu, 24 Apr 2003, Ulf Wiger wrote:
>
>Uffe> Given the current implementation of mnesia:match_object(), I
>Uffe> don't think preserving the order would have to impact
>Uffe> performance, esp. not on large sets (and on small sets, I
>Uffe> don't think it matters.)
>
>For fragmented tables it would have a tremendous impact, as
>the order only is preserved within each fragment. In order
>to return an ordered result, it would require a quite
>substantial sort operation.

True, unless one writes a custom mnesia_frag_hash callback
which manages fragments with some global order. This is
what I had in mind to do with XML Query for Mnesia.

It does bring up the question how far one should try to
preserve the semantics. It would of course be nice if one
could guarantee the same semantics for fragmented tables as
for basic tables, but certainly not at any price (allowing
only the lowest common denominator is also a price to pay,
of course.)

>Of course Mnesia could ensure that the result of a
>match_object on an ordered_set always should be ordered,
>but is it worth the performance penalty?

Given the current implementation, I'd say yes. ;-)

This is the function in mnesia.erl that adds objects from
the transaction store to the found set:

add_match([], Objs, Type) ->
    Objs;
add_match([{Oid, _, delete}|R], Objs, Type) ->
    add_match(R, deloid(Oid, Objs), Type);
add_match([{Oid, Val, delete_object}|R], Objs, Type) ->
    add_match(R, lists:delete(Val, Objs), Type);
add_match([{Oid, Val, write}|R], Objs, Type) ->
    case Type of
        bag ->
            add_match(R, [Val | lists:delete(Val, Objs)], Type);
        _ ->
            add_match(R, [Val | deloid(Oid,Objs)],Type)
    end.

deloid(Oid, []) ->
    [];
deloid({Tab, Key}, [H | T]) when element(2, H) == Key ->
    deloid({Tab, Key}, T);
deloid(Oid, [H | T]) ->
    [H | deloid(Oid, T)].


Given that Type == ordered_set, Objs would be an ordered
list to start with, and sort_insert(Key, Val, Objs) would
not be slower than [Val | deloid(Oid,Objs)]. In fact, the
following sort_insert(Key,Val,Objs) is actually more than
twice as fast as the current implementation in mnesia.erl
(on a given sample with 1000 Objs, 10 written objs in
the lower key range, and 10 in the upper key range.)

sort_insert(Key, Val, [H|T]) when element(2,H) == Key ->
    [Val | T];
sort_insert(Key, Val, [H|_]=L) when element(2,H) > Key ->
    [Val | L];
sort_insert(Key, Val, [H|T]) ->
    [H | sort_insert(Key, Val, T)];
sort_insert(_, _, []) ->
    [].

The reason for this is mostly that deloid/2 is too generic.
It traverses the whole list each time, even after it's found
a matching key. Assuming set semantics, this is not
necessary. I fail to see why lists:keydelete/3 wouldn't do
the job.

The modified mnesia:add_match/3 would then become:

add_match([{{Tab,Key}, Val, write}|R], Objs, Type) ->
    case Type of
        bag ->
            add_match(R, [Val | lists:delete(Val, Objs)], Type);
        ordered_set ->
            add_match(R, sort_insert(Key,Val,Objs), Type);
        _ ->
            add_match(R, [Val |
                          lists:keydelete(Key,2,Objs)],Type)
    end.

Even with this fix, my sort_insert/3 is some 20% faster. (:
At the moment, I fail to see why on this particular test.

On a non-contiguous range of objects, where R contains new
records as well as updated ones, my sort_insert version is
much faster, as deloid/keydelete must traverse the entire
set for each new object.

/Uffe
-- 
Ulf Wiger, Senior Specialist,
   / / /   Architecture & Design of Carrier-Class Software
  / / /    Strategic Product & System Management
 / / /     Ericsson AB, Connectivity and Control Nodes





More information about the erlang-questions mailing list