Performance of mnesia:select/2

Sverker Eriksson sverker.eriksson@REDACTED
Fri Feb 26 18:49:01 CET 2021


One thing to be aware of when benchmarking ETS set vs ordered_set is 
that ordered_set may get boosted cache performance if the keys are accessed in 
term order. The same internal tree nodes are visited over and over again, as 
term-adjacent keys are most often close to each other in the tree.

So if that is not your typical traffic case, the benchmark should access the 
keys in some random order.
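
A minimal sketch of one way to randomise the access order before timing, assuming the keys are already held in a list. The module name shuffle_sketch and the function shuffle/1 are hypothetical names chosen for illustration; they are not part of ETS or of the benchmark discussed in this thread.

```erlang
-module(shuffle_sketch).
-export([shuffle/1]).

%% Return the elements of List in a pseudo-random order.
%% Each element is tagged with a random float, the tagged list is
%% sorted on the tag, and the tag is dropped again.
shuffle(List) ->
    Tagged = [{rand:uniform(), X} || X <- List],
    [X || {_, X} <- lists:sort(Tagged)].
```

The benchmark would then use something like Keys = shuffle_sketch:shuffle(Keys0) before calling timer:tc, so that ordered_set does not benefit from visiting neighbouring tree nodes in sequence.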



/Sverker





From: erlang-questions <erlang-questions-bounces@REDACTED> On Behalf Of Jacob
Sent: den 26 februari 2021 16:35
To: Dan Gudmundsson <dangud@REDACTED>
Cc: Questions erlang-questions <erlang-questions@REDACTED>
Subject: Re: Performance of mnesia:select/2





On 2/26/21 3:53 PM, Dan Gudmundsson wrote:

Interesting, and what times do you get for 500 ets lookups on that data?

Oh, I really should have tested that as well:

timer:tc(fun () -> lists:map(fun(K) -> ets:lookup(t, K) end, Keys) end).

* using lookup, [set, {keypos, 2}]:          3148 us
* using lookup, [ordered_set, {keypos, 2}]:  3423 us

I have repeated the select/pattern runs to have comparable results:

* using pattern, [ordered_set, {keypos, 2}]: 3173 us
* using pattern, [set, {keypos, 2}]:         8091 us

Note that there was quite a high jitter on the measurements, so I have computed 
the mean value of 3 measurements each.

So lookups and pattern/ordered_set were too close to really come to a winner 
here. I had the impression that the variance was slightly higher with lookup, 
but digging into this would require a better measurement setup than what I 
have been using.
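
As a rough sketch of a slightly more robust setup, the same fun could be timed several times and summarised by the median rather than the mean, since the median is less sensitive to outliers. The module name bench_sketch and the function median_us/2 are hypothetical; only timer:tc/1 comes from OTP.

```erlang
-module(bench_sketch).
-export([median_us/2]).

%% Run Fun (a fun of arity 0) Runs times and return the median
%% wall-clock time in microseconds as reported by timer:tc/1.
%% For an even number of runs the lower of the two middle values
%% is returned.
median_us(Fun, Runs) when is_function(Fun, 0), is_integer(Runs), Runs > 0 ->
    Times = lists:sort([element(1, timer:tc(Fun)) || _ <- lists:seq(1, Runs)]),
    lists:nth((Runs + 1) div 2, Times).
```

For example, bench_sketch:median_us(fun () -> lists:map(fun(K) -> ets:lookup(t, K) end, Keys) end, 31) would time the lookup variant 31 times, assuming the table t and the list Keys exist as in the measurements above.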





On Fri, Feb 26, 2021 at 3:12 PM Jacob <jacob01@REDACTED> wrote:

Hi,

assuming that the match spec compiler does clever things with patterns,
I'd use select with the following

   MatchExpression = [ {{'_', K, '_'}, [], ['$_']} || K <- Keys ]

I did some quick measurements with plain ETS, timer:tc and 500 keys out
of 1000000 table entries and got:

   * using pattern, [ordered_set]:               4532605 us
   * using pattern, [set]:                       4645525 us
   * using pattern, [ordered_set, {keypos, 2}]:     3826 us (!!!)
   * using pattern, [set, {keypos, 2}]:             5714 us (!!!)

   * using guards, [ordered_set]:               12542928 us
   * using guards, [set]:                       12310452 us
   * using guards, [ordered_set, {keypos, 2}]:  12365477 us
   * using guards, [set, {keypos, 2}]:          12277839 us

I have initialised the DB with

   [ ets:insert(t, {N, N, integer_to_list(N)}) || N <- lists:seq(1, 1000000) ].

I don't know though, how this will translate to Mnesia, but I'd give
select with pattern on the primary key a try.
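
To make the two variants concrete, here is a small self-contained sketch that builds both match specifications against a plain ETS table and checks that they return the same objects. The module name select_sketch, the table size of 10000 and the three example keys are assumptions chosen for illustration; this is not the exact code used for the measurements above, and it does not measure anything itself.

```erlang
-module(select_sketch).
-export([run/0]).

%% Build a small table keyed on the second tuple element and fetch the
%% same keys with a "pattern" match spec and with a "guards" match spec.
run() ->
    T = ets:new(t, [ordered_set, {keypos, 2}]),
    true = ets:insert(T, [{N, N, integer_to_list(N)} || N <- lists:seq(1, 10000)]),
    Keys = [17, 4242, 9999],
    %% Pattern variant: one match function per key, with the key
    %% position bound to a constant in the match head.
    ByPattern = [{{'_', K, '_'}, [], ['$_']} || K <- Keys],
    %% Guards variant: a single match function whose key position is a
    %% variable, filtered by one 'or' guard over all keys.
    Guard = list_to_tuple(['or' | [{'=:=', '$1', K} || K <- Keys]]),
    ByGuard = [{{'_', '$1', '_'}, [Guard], ['$_']}],
    R1 = lists:sort(ets:select(T, ByPattern)),
    R2 = lists:sort(ets:select(T, ByGuard)),
    true = ets:delete(T),
    {R1, R2}.
```

Both variants select the same rows; the difference seen in the timings is presumably that only the pattern variant has the key bound in the match head, which is consistent with the large gap between the {keypos, 2} pattern runs and all of the guard runs.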

/Jacob


On 2/26/21 11:03 AM, Vance Shipley wrote:
> If I need to lookup a list of keys which is the better approach? Why?
>
> Fselect = fun(Keys) ->
>         MatchHead = {'_', '$1', '$2'},
>         F = fun(Key) ->
>                 {'=:=', '$1', Key}
>         end,
>         MatchConditions = [list_to_tuple(['or' | lists:map(F, Keys)])],
>         MatchBody = ['$_'],
>         MatchFunction = {MatchHead, MatchConditions, MatchBody},
>         MatchExpression = [MatchFunction],
>         mnesia:select(Table, MatchExpression)
> end,
> mnesia:transaction(Fselect, [Keys]).
>
> Fread = fun F([Key | T], Acc) ->
>                 [R] = mnesia:read(Table, Key),
>                 F(T, [R | Acc]);
>         F([], Acc) ->
>                 lists:reverse(Acc)
> end,
> mnesia:transaction(Fread, [Keys, []]).
>
>
> --
>      -Vance
