<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p>Hi,</p>
<p>that's a good hint, thanks. I did indeed use adjacent keys in
my "benchmarks", and the distance between keys does have a big
influence on the times.</p>
<p>In addition, I did the lookup measurements in the shell, which was
unfair, since the select call was a single function call while the
lookup loop was evaluated by the shell.</p>
<p>So the numbers using a compiled module are:</p>
<p>* using lookup, [set, {keypos, 2}]: 295 us<br>
* using lookup, [ordered_set, {keypos, 2}]: 221 us<br>
* using pattern, [ordered_set, {keypos, 2}]: 1206 us<br>
* using pattern, [set, {keypos, 2}]: 4570 us</p>
<p>Measurements with a varying number of keys and a varying distance
between keys:<br>
</p>
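<table>
<tr><th>nkeys</th><th>skip</th><th>pat/ordset</th><th>pat/set</th><th>lu/ordset</th><th>lu/set</th></tr>
<tr><td>100</td><td>1</td><td>261</td><td>390</td><td>128</td><td>63</td></tr>
<tr><td>100</td><td>10</td><td>2482</td><td>382</td><td>154</td><td>66</td></tr>
<tr><td>100</td><td>100</td><td>10348</td><td>488</td><td>100</td><td>76</td></tr>
<tr><td>200</td><td>1</td><td>222</td><td>948</td><td>128</td><td>146</td></tr>
<tr><td>200</td><td>10</td><td>3758</td><td>618</td><td>188</td><td>141</td></tr>
<tr><td>200</td><td>100</td><td>37582</td><td>666</td><td>163</td><td>149</td></tr>
<tr><td>500</td><td>1</td><td>1206</td><td>4570</td><td>221</td><td>295</td></tr>
<tr><td>500</td><td>10</td><td>22419</td><td>3967</td><td>265</td><td>232</td></tr>
<tr><td>500</td><td>100</td><td>224580</td><td>4062</td><td>445</td><td>205</td></tr>
<tr><td>1000</td><td>1</td><td>5489</td><td>15148</td><td>251</td><td>227</td></tr>
<tr><td>1000</td><td>10</td><td>89676</td><td>15077</td><td>654</td><td>370</td></tr>
<tr><td>1000</td><td>100</td><td>889965</td><td>15941</td><td>763</td><td>302</td></tr>
<tr><td>2000</td><td>1</td><td>19062</td><td>57493</td><td>526</td><td>439</td></tr>
<tr><td>2000</td><td>10</td><td>341417</td><td>57467</td><td>1063</td><td>457</td></tr>
<tr><td>2000</td><td>100</td><td>3542462</td><td>59253</td><td>1850</td><td>547</td></tr>
<tr><td>5000</td><td>1</td><td>114807</td><td>359162</td><td>1589</td><td>1477</td></tr>
<tr><td>5000</td><td>10</td><td>2136702</td><td>362625</td><td>2954</td><td>1853</td></tr>
<tr><td>5000</td><td>100</td><td>22301409</td><td>357902</td><td>4933</td><td>2424</td></tr>
</table>
<p>All times are in us.<br>
nkeys: number of keys per query<br>
skip: distance between subsequent keys (Keys = lists:seq(1,
Skip * Nkeys, Skip))<br>
pat: select with key in pattern<br>
lu: lookup loop<br>
set: ETS type set<br>
ordset: ETS type ordered_set<br>
</p>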
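<p>For reference, a minimal sketch of how one cell pair of such a
measurement could be set up as a compiled module. The module name
ets_bench, the function names and the TableSize parameter are
assumptions for illustration; this is not the code that produced the
numbers above.</p>
<pre>
-module(ets_bench).
-export([run/4]).

%% Type is set | ordered_set; NKeys and Skip are as in the legend above.
%% TableSize (assumed parameter) must be at least Skip * NKeys.
run(Type, NKeys, Skip, TableSize) ->
    Tab = ets:new(bench, [Type, {keypos, 2}]),
    %% Same row layout as in the original measurements: {N, N, integer_to_list(N)}
    true = ets:insert(Tab, [{N, N, integer_to_list(N)}
                            || N &lt;- lists:seq(1, TableSize)]),
    Keys = lists:seq(1, Skip * NKeys, Skip),
    {LookupUs, L} = timer:tc(fun() -> lookup_loop(Tab, Keys) end),
    {SelectUs, S} = timer:tc(fun() -> select_pattern(Tab, Keys) end),
    %% Both approaches must return the same rows (order may differ for set).
    true = lists:sort(L) =:= lists:sort(S),
    true = ets:delete(Tab),
    #{lookup_us => LookupUs, select_us => SelectUs}.

%% "lu": one ets:lookup/2 call per key.
lookup_loop(Tab, Keys) ->
    lists:append([ets:lookup(Tab, K) || K &lt;- Keys]).

%% "pat": a single ets:select/2 call with one match function per key,
%% the key being bound in the match head.
select_pattern(Tab, Keys) ->
    ets:select(Tab, [{{'_', K, '_'}, [], ['$_']} || K &lt;- Keys]).
</pre>
<p>For example, ets_bench:run(ordered_set, 500, 10, 1000000) returns a
map with both times in us. A single run is noisy, so averaging over
several runs is advisable.</p>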
<p>Long story short: follow Dan Gudmundsson's suggestion and use plain
lookups for ETS-based operations. They are much faster, scale better
and suffer less from non-adjacent keys.</p>
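<p>To take the cache effect that Sverker describes (quoted below) out
of the picture, the keys can be shuffled before measuring. A minimal
sketch; the module name key_shuffle is hypothetical and not part of
the measurements above:</p>
<pre>
-module(key_shuffle).
-export([shuffle/1]).

%% Returns the elements of Keys in random order by tagging each key
%% with a random float, sorting on the tag and dropping the tag again.
shuffle(Keys) ->
    Tagged = [{rand:uniform(), K} || K &lt;- Keys],
    [K || {_, K} &lt;- lists:sort(Tagged)].
</pre>
<p>Usage: Keys = key_shuffle:shuffle(lists:seq(1, Skip * Nkeys, Skip)).</p>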
<p>On the other hand, the results might be different with
distributed Mnesia and disk-only tables.</p>
<p>/Jacob<br>
</p>
<div class="moz-cite-prefix">On 2/26/21 6:49 PM, Sverker Eriksson
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:AM7PR07MB6900F22E02092FF9D9D1B79AFD9D9@AM7PR07MB6900.eurprd07.prod.outlook.com">
<div class="WordSection1">
<p class="MsoNormal"><span lang="EN-US">One thing to be aware of
when doing benchmarks on ETS set vs ordered_set is that
ordered_set may get boosted cache performance if the keys
are accessed in term order. The same internal tree nodes are
visited over and over again, as term-adjacent keys are most
often close to each other in the tree.</span></p>
<p class="MsoNormal"><span lang="EN-US">So if that is not your
typical traffic case, the benchmark should access the keys
in some random order.</span></p>
<p class="MsoNormal"><span lang="EN-US"> </span></p>
<p class="MsoNormal"><span lang="EN-US">/Sverker</span></p>
<div>
<div>
<p class="MsoNormal"><b><span lang="EN-US">From:</span></b><span
lang="EN-US"> erlang-questions
<a class="moz-txt-link-rfc2396E" href="mailto:erlang-questions-bounces@erlang.org">&lt;erlang-questions-bounces@erlang.org&gt;</a> <b>On
Behalf Of </b>Jacob<br>
<b>Sent:</b> 26 February 2021 16:35<br>
<b>To:</b> Dan Gudmundsson <a class="moz-txt-link-rfc2396E" href="mailto:dangud@gmail.com">&lt;dangud@gmail.com&gt;</a><br>
<b>Cc:</b> Questions erlang-questions
<a class="moz-txt-link-rfc2396E" href="mailto:erlang-questions@erlang.org">&lt;erlang-questions@erlang.org&gt;</a><br>
<b>Subject:</b> Re: Performance of mnesia:select/2</span></p>
</div>
</div>
<p class="MsoNormal"> </p>
<p> </p>
<div>
<p class="MsoNormal">On 2/26/21 3:53 PM, Dan Gudmundsson
wrote:</p>
</div>
<blockquote>
<div>
<p class="MsoNormal">Interesting, and what times do you get for
500 ETS lookups on that data?</p>
</div>
</blockquote>
<p>Oh, I really should have tested that as well:</p>
<p>timer:tc(fun () -> lists:map(fun(K) -> ets:lookup(t, K)
end, Keys) end).</p>
<p>* using lookups: [set, {keypos, 2}]: 3148 us<br>
* using lookup: [ordered_set, {keypos, 2}]: 3423 us</p>
<p>I have repeated the select/pattern runs to have comparable
results:</p>
<p>* using pattern, [ordered_set, {keypos, 2}]: 3173 us<br>
* using pattern, [set, {keypos, 2}]: 8091 us</p>
<p>Note that there was quite a high jitter on the measurements,
so I have computed the mean of 3 measurements for each.</p>
<p>So lookups and pattern/ordered_set were too close to declare
a winner here. I had the impression that the variance
was slightly higher with lookup, but digging into this would
require a better measurement setup than what I have been
using.</p>
<p> </p>
<blockquote>
<p class="MsoNormal"> </p>
<div>
<div>
<p class="MsoNormal">On Fri, Feb 26, 2021 at 3:12 PM Jacob
&lt;<a href="mailto:jacob01@gmx.net"
moz-do-not-send="true">jacob01@gmx.net</a>&gt; wrote:</p>
</div>
<blockquote>
<p class="MsoNormal">Hi,<br>
<br>
assuming that the match spec compiler does clever things
with patterns,<br>
I'd use select with the following<br>
<br>
MatchExpression = [ {{'_', K, '_'}, [], ['$_']} || K
&lt;- Keys ]<br>
<br>
I did some quick measurements with plain ETS, timer:tc
and 500 keys out<br>
of 1000000 table entries and got:<br>
<br>
* using pattern, [ordered_set]: 4532605 us<br>
* using pattern, [set] : 4645525 us<br>
* using pattern, [ordered_set, {keypos, 2}]: 3826 us
(!!!)<br>
* using pattern, [set, {keypos, 2}]: 5714 us (!!!)<br>
<br>
* using guards, [ordered_set]: 12542928 us<br>
* using guards, [set]: 12310452 us<br>
* using guards, [ordered_set, {keypos, 2}]: 12365477
us<br>
* using guards, [set, {keypos, 2}]: 12277839 us<br>
<br>
I have initialised the DB with [ ets:insert(t, {N, N,<br>
integer_to_list(N)}) || N &lt;- lists:seq(1, 1000000) ].<br>
<br>
I don't know though, how this will translate to Mnesia,
but I'd give<br>
select with pattern on the primary key a try.<br>
<br>
/Jacob<br>
<br>
<br>
On 2/26/21 11:03 AM, Vance Shipley wrote:<br>
> If I need to lookup a list of keys which is the
better approach? Why?<br>
><br>
> Fselect = fun(Keys) -><br>
> MatchHead = {'_', '$1', '$2'},<br>
> F = fun(Key) -><br>
> {'=:=', '$1', Key}<br>
> end,<br>
> MatchConditions = [list_to_tuple(['or' |
lists:map(F, Keys)])],<br>
> MatchBody = ['$_'],<br>
> MatchFunction = {MatchHead,
MatchConditions, MatchBody},<br>
> MatchExpression = [MatchFunction],<br>
> mnesia:select(Table, MatchExpression)<br>
> end,<br>
> mnesia:transaction(Fselect, [Keys]).<br>
><br>
> Fread = fun F([Key | T], Acc) -><br>
> [R] = mnesia:read(Table, Key),<br>
> F(T, [R | Acc]);<br>
> F([], Acc) -><br>
> lists:reverse(Acc)<br>
> end,<br>
> mnesia:transaction(Fread, [Keys, []]).<br>
><br>
><br>
> --<br>
> -Vance</p>
</blockquote>
</div>
</blockquote>
</div>
</blockquote>
</body>
</html>