<html>


  <head>


    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">


  </head>


  <body>


    <p>Hi,</p>


    <p>That's a good hint, thanks. I did indeed use adjacent keys in
      my "benchmarks", and the distance between keys does have a big
      influence on the times.</p>


    <p>In addition I did the lookup measurements in the shell, which was


      unfair since the select call was just a function call while the


      lookup loop was executed by the shell.</p>
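    <p>A minimal sketch of what such a compiled benchmark could look like.
      The module name <code>ets_bench</code>, the function
      <code>run/3</code> and the table size of 100000 entries are
      assumptions for illustration, not the module actually used for the
      numbers in this mail; the record layout and table options follow the
      setup described earlier in this thread.</p>

```erlang
-module(ets_bench).
-export([run/3]).

%% Hypothetical benchmark sketch (names and sizes are assumptions).
%% Type is set | ordered_set, NKeys the number of keys per query,
%% Skip the distance between subsequent keys.
%% Returns {LookupMicros, SelectMicros}.
run(Type, NKeys, Skip) ->
    T = ets:new(t, [Type, {keypos, 2}]),
    true = ets:insert(T, [{N, N, integer_to_list(N)}
                          || N <- lists:seq(1, 100000)]),
    Keys = lists:seq(1, Skip * NKeys, Skip),
    %% Lookup loop, executed inside compiled code.
    {LookupUs, Found} =
        timer:tc(fun() -> [ets:lookup(T, K) || K <- Keys] end),
    %% One select call with the key bound in each match head.
    MatchSpec = [{{'_', K, '_'}, [], ['$_']} || K <- Keys],
    {SelectUs, Selected} =
        timer:tc(fun() -> ets:select(T, MatchSpec) end),
    %% Sanity check: both approaches return the same objects.
    true = lists:sort(lists:append(Found)) =:= lists:sort(Selected),
    true = ets:delete(T),
    {LookupUs, SelectUs}.
```

    <p>Calling <code>ets_bench:run(ordered_set, 500, 1)</code> from the
      shell then only times compiled code; the absolute numbers will of
      course depend on the machine.</p>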


    <p>So the numbers using a compiled module are</p>


    <p>* using lookup, [set, {keypos, 2}]: 295 us<br>
      * using lookup, [ordered_set, {keypos, 2}]: 221 us<br>
      * using pattern, [ordered_set, {keypos, 2}]: 1206 us<br>
      * using pattern, [set, {keypos, 2}]: 4570 us</p>


    <p>Measurements with a varying number of keys and a varying distance


      between keys:<br>


    </p>


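    <table>
      <tr><th>nkeys</th><th>skip</th><th>pat/ordset (us)</th><th>pat/set (us)</th><th>lu/ordset (us)</th><th>lu/set (us)</th></tr>
      <tr><td>100</td><td>1</td><td>261</td><td>390</td><td>128</td><td>63</td></tr>
      <tr><td>100</td><td>10</td><td>2482</td><td>382</td><td>154</td><td>66</td></tr>
      <tr><td>100</td><td>100</td><td>10348</td><td>488</td><td>100</td><td>76</td></tr>
      <tr><td>200</td><td>1</td><td>222</td><td>948</td><td>128</td><td>146</td></tr>
      <tr><td>200</td><td>10</td><td>3758</td><td>618</td><td>188</td><td>141</td></tr>
      <tr><td>200</td><td>100</td><td>37582</td><td>666</td><td>163</td><td>149</td></tr>
      <tr><td>500</td><td>1</td><td>1206</td><td>4570</td><td>221</td><td>295</td></tr>
      <tr><td>500</td><td>10</td><td>22419</td><td>3967</td><td>265</td><td>232</td></tr>
      <tr><td>500</td><td>100</td><td>224580</td><td>4062</td><td>445</td><td>205</td></tr>
      <tr><td>1000</td><td>1</td><td>5489</td><td>15148</td><td>251</td><td>227</td></tr>
      <tr><td>1000</td><td>10</td><td>89676</td><td>15077</td><td>654</td><td>370</td></tr>
      <tr><td>1000</td><td>100</td><td>889965</td><td>15941</td><td>763</td><td>302</td></tr>
      <tr><td>2000</td><td>1</td><td>19062</td><td>57493</td><td>526</td><td>439</td></tr>
      <tr><td>2000</td><td>10</td><td>341417</td><td>57467</td><td>1063</td><td>457</td></tr>
      <tr><td>2000</td><td>100</td><td>3542462</td><td>59253</td><td>1850</td><td>547</td></tr>
      <tr><td>5000</td><td>1</td><td>114807</td><td>359162</td><td>1589</td><td>1477</td></tr>
      <tr><td>5000</td><td>10</td><td>2136702</td><td>362625</td><td>2954</td><td>1853</td></tr>
      <tr><td>5000</td><td>100</td><td>22301409</td><td>357902</td><td>4933</td><td>2424</td></tr>
    </table>

    <p>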


      nkeys: number of keys per query<br>


      skip: distance between subsequent keys (Keys = lists:seq(1,
      Skip * Nkeys, Skip))<br>


      pat:     select with key in pattern<br>


      lu:       lookup loop<br>


      set:     ETS type set<br>


      ordset:  ETS type ordered_set<br>


    </p>
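    <p>To make the legend concrete, here is a small sketch of how the key
      lists and the nkeys/skip combinations can be generated. The module
      name <code>key_sweep</code> and its functions are hypothetical
      helpers, not part of the original measurement code.</p>

```erlang
-module(key_sweep).
-export([keys/2, combos/0]).

%% Keys for one query: NKeys keys, Skip apart, starting at 1
%% (same expression as in the legend).
keys(NKeys, Skip) ->
    lists:seq(1, Skip * NKeys, Skip).

%% All {NKeys, Skip} combinations used in the measurement series.
combos() ->
    [{NKeys, Skip} || NKeys <- [100, 200, 500, 1000, 2000, 5000],
                      Skip <- [1, 10, 100]].
```

    <p>For example, <code>key_sweep:keys(5, 3)</code> yields
      <code>[1,4,7,10,13]</code>: five keys with a distance of three.</p>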


    <p>Long story short: for ETS-based operations, follow Dan
      Gudmundsson's suggestion and use lookups; they are much faster,
      scale better and suffer less from non-adjacent keys.</p>


    <p>On the other hand, the results might be different with


      distributed Mnesia and disk-only tables.</p>


    <p>/Jacob<br>


    </p>


    <div class="moz-cite-prefix">On 2/26/21 6:49 PM, Sverker Eriksson


      wrote:<br>


    </div>


    <blockquote type="cite"


cite="mid:AM7PR07MB6900F22E02092FF9D9D1B79AFD9D9@AM7PR07MB6900.eurprd07.prod.outlook.com">




      <div class="WordSection1">


        <p class="MsoNormal"><span lang="EN-US">One thing to be aware of


            when doing benchmarks on ETS set vs ordered_set is that


            ordered_set may get boosted cache performance if the keys


            are accessed in term order. The same internal tree nodes are
            visited over and over again, as term-adjacent keys are most
            often close to each other in the tree.</span></p>


        <p class="MsoNormal"><span lang="EN-US">So if that is not your


            typical traffic case, the benchmark should access the keys


            in some random order.</span></p>
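        <p class="MsoNormal">A minimal sketch of accessing the keys in
          random order, as suggested above: the key list is shuffled once
          before the timed lookups. The module name
          <code>key_shuffle</code> is a hypothetical helper; only
          <code>rand:uniform/0</code> and <code>lists:sort/1</code> from
          the standard library are used.</p>

```erlang
-module(key_shuffle).
-export([shuffle/1]).

%% Shuffle a list by tagging every element with a random float
%% and sorting on that tag. Good enough for benchmark input,
%% not meant as a cryptographic shuffle.
shuffle(Keys) ->
    Tagged = [{rand:uniform(), K} || K <- Keys],
    [K || {_, K} <- lists:sort(Tagged)].
```

        <p class="MsoNormal">The shuffle should happen outside the timed
          fun, so that only the table accesses are measured.</p>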


        <p class="MsoNormal"><span lang="EN-US"> </span></p>


        <p class="MsoNormal"><span lang="EN-US">/Sverker</span></p>




        <div>


          <div>


            <p class="MsoNormal"><b><span lang="EN-US">From:</span></b><span


                lang="EN-US"> erlang-questions


                <a class="moz-txt-link-rfc2396E" href="mailto:erlang-questions-bounces@erlang.org">&lt;erlang-questions-bounces@erlang.org&gt;</a> <b>On


                  Behalf Of </b>Jacob<br>


                <b>Sent:</b> den 26 februari 2021 16:35<br>


                <b>To:</b> Dan Gudmundsson <a class="moz-txt-link-rfc2396E" href="mailto:dangud@gmail.com">&lt;dangud@gmail.com&gt;</a><br>


                <b>Cc:</b> Questions erlang-questions


                <a class="moz-txt-link-rfc2396E" href="mailto:erlang-questions@erlang.org">&lt;erlang-questions@erlang.org&gt;</a><br>


                <b>Subject:</b> Re: Performance of mnesia:select/2</span></p>


          </div>


        </div>




        <div>


          <p class="MsoNormal">On 2/26/21 3:53 PM, Dan Gudmundsson


            wrote:</p>


        </div>


        <blockquote>


          <div>


            <p class="MsoNormal">Interesting, and what times do you get for
              500 ETS lookups on that data?</p>


          </div>


        </blockquote>


        <p>Oh, I really should have tested that as well:</p>


        <p>timer:tc(fun () -> lists:map(fun(K) -> ets:lookup(t, K)


          end, Keys) end).</p>


        <p>* using lookup, [set, {keypos, 2}]: 3148 us<br>
          * using lookup, [ordered_set, {keypos, 2}]: 3423 us</p>


        <p>I have repeated the select/pattern runs to have comparable


          results:</p>


        <p>* using pattern, [ordered_set, {keypos, 2}]: 3173 us<br>


          * using pattern, [set, {keypos, 2}]:               8091 us</p>


        <p>Note that there was quite a high jitter on the measurements,
          so I have computed the mean value of 3 measurements each.</p>
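        <p>Averaging over a few runs can be sketched as follows. The
          module name <code>bench_mean</code> and the function
          <code>mean_us/2</code> are assumptions for illustration; the
          thread does not show the code that was actually used.</p>

```erlang
-module(bench_mean).
-export([mean_us/2]).

%% Run Fun the given number of times with timer:tc/1 and return
%% the mean wall-clock time in microseconds as a float.
mean_us(Fun, Runs) when is_function(Fun, 0), is_integer(Runs), Runs > 0 ->
    Times = [element(1, timer:tc(Fun)) || _ <- lists:seq(1, Runs)],
    lists:sum(Times) / Runs.
```

        <p>For example, <code>bench_mean:mean_us(fun() -> [ets:lookup(t, K)
          || K &lt;- Keys] end, 3)</code> gives the mean of three lookup
          runs, assuming a table <code>t</code> and a list
          <code>Keys</code> are already set up.</p>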


        <p>So lookups and pattern/ordered_set were too close to declare a
          winner here. I had the impression that the variance


          was slightly higher with lookup, but digging into this would


          require a better measurement setup than what I have been


          using.</p>


        <p> </p>


        <blockquote>


          <p class="MsoNormal"> </p>


          <div>


            <div>


              <p class="MsoNormal">On Fri, Feb 26, 2021 at 3:12 PM Jacob


                <<a href="mailto:jacob01@gmx.net"


                  moz-do-not-send="true">jacob01@gmx.net</a>> wrote:</p>


            </div>


            <blockquote>


              <p class="MsoNormal">Hi,<br>


                <br>


                assuming that the match spec compiler does clever things


                with patterns,<br>


                I'd use select with the following<br>


                <br>


                   MatchExpression = [ {{'_', K, '_'}, [], ['$_']} || K


                <- Keys ]<br>


                <br>


                I did some quick measurements with plain ETS, timer:tc


                and 500 keys out<br>


                of 1000000 table entries and got:<br>


                <br>


                   * using pattern, [ordered_set]: 4532605 us<br>


                   * using pattern, [set]              :  4645525 us<br>


                   * using pattern, [ordered_set, {keypos, 2}]: 3826 us


                (!!!)<br>


                   * using pattern, [set, {keypos, 2}]: 5714 us (!!!)<br>


                <br>


                   * using guards, [ordered_set]: 12542928 us<br>


                   * using guards, [set]:               12310452 us<br>


                   * using guards, [ordered_set, {keypos, 2}]: 12365477


                us<br>


                   * using guards, [set, {keypos, 2}]: 12277839 us<br>


                <br>


                I have initialised the DB with [ ets:insert(t, {N, N,<br>


                integer_to_list(N)}) || N <- lists:seq(1, 1000000) ].<br>


                <br>


                I don't know though, how this will translate to Mnesia,


                but I'd give<br>


                select with pattern on the primary key a try.<br>


                <br>


                /Jacob<br>


                <br>


                <br>


                On 2/26/21 11:03 AM, Vance Shipley wrote:<br>


                > If I need to lookup a list of keys which is the


                better approach? Why?<br>


                ><br>


                > Fselect = fun(Keys) -><br>


                >         MatchHead = {'_', '$1', '$2'},<br>


                >         F = fun(Key) -><br>


                >                 {'=:=', '$1', Key}<br>


                >         end,<br>


                >         MatchConditions = [list_to_tuple(['or' |


                lists:map(F, Keys)])],<br>


                >         MatchBody = ['$_'],<br>


                >         MatchFunction = {MatchHead,


                MatchConditions, MatchBody},<br>


                >         MatchExpression = [MatchFunction],<br>


                >         mnesia:select(Table, MatchExpression)<br>


                > end,<br>


                > mnesia:transaction(Fselect, [Keys]).<br>


                ><br>


                > Fread = fun F([Key | T], Acc) -><br>


                >                 [R] = mnesia:read(Table, Key),<br>


                >                 F(T, [R | Acc]);<br>


                >         F([], Acc) -><br>


                >                 lists:reverse(Acc)<br>


                > end,<br>


                > mnesia:transaction(Fread, [Keys, []]).<br>


                ><br>


                ><br>


                > --<br>


                >      -Vance</p>
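              <p>The two select variants compared above, key in the match
                head versus key in a guard, can be sketched with plain ETS
                as follows. The module name <code>match_specs</code> and
                its function names are hypothetical. As far as I
                understand, binding the key in the match head of a
                <code>{keypos, 2}</code> table lets ETS go to the key
                directly, while the guard variant leaves the key as
                <code>'$1'</code> and has to scan the whole table, which
                would explain the large difference in the timings.</p>

```erlang
-module(match_specs).
-export([pattern_spec/1, guard_spec/1]).

%% One match function per key, with the key bound in the match head.
%% Assumes keys that need no escaping in a match spec (e.g. integers).
pattern_spec(Keys) ->
    [{{'_', K, '_'}, [], ['$_']} || K <- Keys].

%% A single match function with one big 'or' guard over all keys,
%% as in the mnesia:select/2 example quoted above.
%% Assumes at least two keys.
guard_spec(Keys) ->
    Conds = [{'=:=', '$1', K} || K <- Keys],
    [{{'_', '$1', '_'}, [list_to_tuple(['or' | Conds])], ['$_']}].
```

              <p>Both specs select the same objects when passed to
                <code>ets:select/2</code>; only the amount of work done
                inside ETS differs.</p>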


            </blockquote>


          </div>


        </blockquote>


      </div>


    </blockquote>


  </body>


</html>