[erlang-questions] Mnesia record with composite key and QLC questions

Steve Davis <>
Mon May 18 19:47:21 CEST 2009


Hi Leon,

These are higher level database design questions, so the short answer
is that "it depends on your application".

Here's a few considerations that I use based on my own experience.
Many others here have greater experience and my provide further
insights..

One of the great things about using mnesia is that the "data
impedance" is zero (tuples are stored as tuples etc). Full data
normalization (i.e. treatng mnesia as if it were a traditional RDBMS)
will probably not give you the best result - as it will likely
introduce a large number of joins to get the results back.

It's probably best to start with records (tuples) that make sense
inside your application code. If however you find you are doing "deep
queries" on your data, you may want to split out certain structures
into a separate tables. This may also be necessary with data persisted
where a "monolithic" table could exceed 3GB (on 32-bit) or 4GB (on 64-
bit) during its lifetime. I believe, but I'm not 1000% sure that this
storage limit doesn't apply to ram_copies but only to disk_copies.
Perhaps someone can confirm this as I've never actually tested the
limit as yet.

As with anything, the best advice I have had is to code it cleanly
first, then measure the application under as close to a real
deployment laod as is possible to see what optimizations are required/
make most sense.

regards,
Steve


On May 18, 8:01 am, Leon de Rooij <> wrote:
> Hi Steve,
>
> Thanks, I didn't know that I could extract variables from each record  
> (on the left of the <- ) and use them in the search conditions on the  
> right.. Very nice :-)
>
> I've also found that I  can do:
>
> get_users_by_domain(Domain) ->
>          mnesia:dirty_match_object({user, {Domain, '_'}, '_', '_',  
> '_', '_'}).
>
> It's perhaps indeed nicer to have relations spread over several tables  
> like you suggest, but where should I stop doing that ? I mean, each  
> user is for example in one or more groups. Then it would seem nice to  
> have a list called Groups in the user record which just contains some  
> group names, instead of having a separate table with foreign keys to  
> the guid's of users together with a group name.. Perhaps it depends on  
> whether the data will be used to search on, written often, or whether  
> the data will only be returned ?
>
> Anyway, I now want to try putting a lot of records in the db ( a  
> couple of million ? ) and see what the speed difference is comparing  
> both methods.
>
> thanks again!
>
> Kind regards,
>
> Leon
>
> On May 16, 2009, at 3:31 PM, Steve Davis wrote:
>
>
>
>
>
> > Hi Leon,
>
> > Try something like...
>
> > find_by_domain(Domain) ->
> >    F = fun() ->
> >            qlc:e(qlc:q([U || U = {user, {D, _}, _} <- mnesia:table(user), D =:=
> > Domain]))
> >    end,
> >    mnesia:transaction(F).
>
> > ...though I would suggest that you reconsider your PK here. Primary
> > keys should really be "opaque".
>
> > Perhaps a better solution is to generate a GUID for each user then use
> > that as the primary key, and then make their domain and username
> > attributes of that.
> > e.g.
> > -record(user, {guid, username, domain}}.
>
> > ....which will make this and other searches much more sane.
>
> > As for the K/V pairs (profile data?), again maybe think about a single
> > table referencing each K/V pair to the user by guid then you can
> > collect their profile properties and search for users with particular
> > properties by the property key.
>
> > Obviously you will need to index certain fields as appropriate to your
> > app for reasonable performance.
>
> > Regards,
> > Steve
>
> > On May 15, 3:32 am, Leon de Rooij <> wrote:
> >> Hi all,
>
> >> I want to store users in an mnesia database, but want the key not  
> >> only
> >> to be unique on username, but also on domain.
>
> >> So I defined the user record as follows:
>
> >> -record(user, {domain_username, params=[]})
>
> >> where domain_username = {Domain, Username}
> >> and params = [ {Key1,Value1}, {Key2, Value2}, ... ]
>
> >> I chose params to be a list because I don't know beforehand how many
> >> key/value tuples will be in there.
>
> >> Is this the correct way to do things ? (Instead of having domain and
> >> username as standalone strings (lists) in the record and make it a  
> >> bag
> >> instead of a set, or for the params having a seperate table with
> >> relations/foreign keys, like I would do with SQL ?)
>
> >> It's also not clear to me what is the best way to search through the
> >> records:
>
> >> To find a user based on a param, the following function works:
>
> >> Param = {"accountcode", "1234"},
> >> Fun = fun() -> qlc:eval(qlc:q([ U || U <- mnesia:table(user),
> >> lists:member(Param, U#user.params) ])) end,
> >> mnesia:transaction(Fun).
>
> >> That works, but can this be done more efficient ? (Searching on  
> >> params
> >> won't be done often, but still I'd like to know..)
>
> >> Also, I'd like to be able to for example get a list of all users in a
> >> certain domain, but I can't get it to work yet..
>
> >> I tried this:
>
> >> Domain = "test.com",
> >> Fun = fun() -> qlc:eval(qlc:q([ U || U <- mnesia:table(user),
> >> {Domain,_} =:= U#user.domain_username ])) end,
> >> mnesia:transaction(Fun).
>
> >> But, having an underscore for username in the tuple to match equality
> >> is not allowed (compilation breaks with "variable '_' is unbound").
>
> >> Then I tried using assignment operator there:
>
> >> Fun = fun() -> qlc:eval(qlc:q([ U || U <- mnesia:table(user),
> >> {Domain,_} = U#user.domain_username ])) end,
>
> >> which does compile, but it breaks at runtime with:
>
> >> =ERROR REPORT==== 15-May-2009::08:26:37 ===
> >> Error in process <0.31.0> with exit value: {undef,
> >> [{test,get_user_by_domain,["test.com"]},{erl_eval,do_apply,5},
> >> {shell,exprs,6},{shell,eval_loop,3}]}
>
> >> ** exited: {undef,[{test,get_user_by_domain,["test.com"]},
> >>                     {erl_eval,do_apply,5},
> >>                     {shell,exprs,6},
> >>                     {shell,eval_loop,3}]} **
>
> >> Can anyone tell me what is best practice in these cases ?
>
> >> Thanks and kind regards,
>
> >> Leon
> >> _______________________________________________
> >> erlang-questions mailing list
> >> ://www.erlang.org/mailman/listinfo/erlang-questions
> > _______________________________________________
> > erlang-questions mailing list
> > 
> >http://www.erlang.org/mailman/listinfo/erlang-questions
>
> _______________________________________________
> erlang-questions mailing list
> ://www.erlang.org/mailman/listinfo/erlang-questions



More information about the erlang-questions mailing list