[erlang-questions] qlc join query results

Mon Dec 8 04:13:57 CET 2008

Hi all,
    I'm hoping someone will be able to explain some interesting results
we're getting when using qlc.  Essentially, when we try to do what looks
like a relatively simple "join" style operation, we're getting different
results depending on the ordering of our generator terms, and we don't
understand why.  Here's some sample code that demonstrates the issue:
------------------------ post.erl ----------------------------
-module(post).                            
-export([start/0]).                       
-import(lists, [seq/2, map/2]).           
-import(io, [fwrite/1, fwrite/2]).        
-include_lib("stdlib/include/qlc.hrl").   

-record(r1, {a, b}).
-record(r2, {c, d}).

start() ->
        % Start up mnesia and set up two tables, one for each record type
        ok = mnesia:start(),
        {atomic, ok} = mnesia:create_table(r1
        , [{attributes, record_info(fields, r1)}]),
        {atomic, ok} = mnesia:create_table(r2
        , [{attributes, record_info(fields, r2)}]),

        % Populate the tables: one r1 record and 100 r2 records
        ok = mnesia:dirty_write(r1, #r1{a=1, b=1}),
        map(
                fun (N) -> ok = mnesia:dirty_write(r2, #r2{c=N, d=1}) end
        , lists:seq(1, 100)
        ),

        Query = fun(N, Q) ->
                case mnesia:transaction(fun() -> qlc:e(Q) end) of
                        {atomic, L} -> fwrite("~s result = ~p entries\n", [N, length(L)]);
                        Error -> fwrite("~s error: ~p\n", [N, Error])
                end
        end,

        Query("join by field r1, r2"
                , qlc:q([
                        {X#r1.b, A#r2.d}
                        || X <- mnesia:table(r1)
                        , A <- mnesia:table(r2)
                        , X#r1.b == A#r2.d
                ])),
        Query("join by field r2, r1"
                , qlc:q([
                        {X#r1.b, A#r2.d}
                        || A <- mnesia:table(r2)
                        , X <- mnesia:table(r1)
                        , X#r1.b == A#r2.d
                ])),
        Query("join by field r1, r2 nested"
                , qlc:q([
                        {X#r1.b, A#r2.d}
                        || X <- mnesia:table(r1)
                        , A <- mnesia:table(r2)
                        , X#r1.b == A#r2.d
                ], {join, nested_loop})),
        ok.
----------------------------------------------------------------

This produces the following output:

join by field r1, r2 result = 2 entries
join by field r2, r1 result = 100 entries
join by field r1, r2 nested result = 100 entries

The second and third lines are what we'd expect to get - the question
is, why do we get two results from the first query?  What exactly is it
about that ordering of generators that causes qlc to choose a different
join method and, in a more general sense, how can we predict which
method will be chosen?  Also, why exactly 2 results?  I can understand 1
or 100, but 2?
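
One way to at least see which plan qlc settles on is qlc:info/1, which returns a textual description of how a query handle would be evaluated. Below is a minimal sketch, not something we have verified against the queries above: the module name post_info and the helper explain/2 are made-up names for illustration, and wrapping the call in a transaction is only a precaution in case the mnesia table handle needs one.

```erlang
-module(post_info).
-export([explain/2]).

%% Hypothetical helper: print the evaluation plan that qlc reports
%% for the query handle QH, labelled with Name.
%% qlc:info/1 returns a string by default. The transaction wrapper is
%% an assumption made for safety, not a documented requirement.
explain(Name, QH) ->
    case mnesia:transaction(fun() -> qlc:info(QH) end) of
        {atomic, Plan} ->
            io:format("~s plan:~n~s~n", [Name, Plan]);
        Error ->
            io:format("~s error: ~p~n", [Name, Error])
    end.
```

Calling post_info:explain/2 with the same name and query handle passed to Query above should print the plan for each generator ordering, so the two can be compared side by side.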

Thanks in advance,

Bernard