<div dir="ltr"><div><span style="font-size:16px">[Sorry for the duplicate, Chaitanya - I meant to reply to the list]</span></div><span style="font-size:16px"><div><span style="font-size:16px"><br></span></div>I may be missing something, but what's wrong with mnesia:[dirty_]select/2?</span><br style="font-size:16px"><br style="font-size:16px"><span style="font-size:16px">mnesia:dirty_select(my_table, [{#my_record{uuid=UUID, _='_'}, [], [true]}])</span><div style="font-size:16px"><br></div><div style="font-size:16px">Pro: Works on any table type, no large row copying, no extra storage, automatically uses index if available.</div><div style="font-size:16px">Cons: Only about 3 people on the planet can write matchspecs off the top of their head.</div><div style="font-size:16px"><br></div><div style="font-size:16px">The call returns [true] when the UUID is present, and [] otherwise. No issue with big rows, nor with lots of entries.</div><div style="font-size:16px"><br></div><div style="font-size:16px">If you're feeling really adventurous, you can even do all the UUIDs in one call:</div><div style="font-size:16px"><br></div><div style="font-size:16px">mnesia:dirty_select(my_table, [{#my_record{uuid=UUID, _='_'}, [], [true]} || UUID <- MyListOfUUIDs])</div><div style="font-size:16px"><br></div><div style="font-size:16px">If the result is [], none were present. 
If it's equal in length to MyListOfUUIDs then they were all present (provided MyListOfUUIDs contains no duplicates and each UUID matches at most one row).</div><span class="" style="font-size:16px"><font color="#888888"><br></font></span><div class="" style="font-size:16px"><span class=""><font color="#888888">B</font></span></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Sun, Jul 17, 2016 at 2:47 PM, Chaitanya Chalasani <span dir="ltr"><<a href="mailto:cchalasani@me.com" target="_blank">cchalasani@me.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On 16-Jul-2016, at 19:56, Mikael Pettersson <<a href="mailto:mikpelinux@gmail.com">mikpelinux@gmail.com</a>> wrote:<br>
> all_keys can be horribly expensive and should be avoided if possible, but for small tables it may be acceptable.<br>
><br>
> I'd do one of the following:<br>
><br>
> 1. mnesia:dirty_read(T, K) and check result for [] vs [_|_]<br>
> Pro: easy, works<br>
> Con: the data copy may be expensive for large records<br>
<br>
</span>Yes, indeed.<br>
<span class=""><br>
><br>
> 2. Make the table an ordered_set; mnesia:dirty_prev(T, mnesia:dirty_next(T, K)) and check if K is returned<br>
> Pro: avoids the data copy<br>
</span>> Con: requires an ordered_set, requires code to handle boundary conditions wrt '$end_of_table'<br>
<br>
With UUIDs as the primary key, an ordered_set might eventually slow down my writes.<br>
<span class=""><br>
><br>
> 3. Store the keys w/o data in a separate table, then do a dirty_read in that<br>
> Pro: reduces copying<br>
> Con: requires more storage, the lookup in the side table won't provide cache hints to help your access<br>
> in the main table (but that may be Ok if the side table is hit orders of magnitude more often)<br>
><br>
> One could implement some sort of sparse bitmap or range tree and use that to record key presence, but I'm<br>
> not sure it would be worthwhile in Erlang.<br>
<br>
</span>Yes, I am looking into this possibility, as Eric has also suggested the same approach. I would consider a bitmap as long as it doesn’t complicate the solution more than the performance gain justifies.<br>
<br>
Also, while going through my use case, I realized that a table is remote rarely enough that I can make peace with ets:member, so I implemented the check as shown below:<br>
<br>
is_key(Tname, Key) -><br>
case catch ets:member(Tname, Key) of<br>
{'EXIT', _Reason} -><br>
is_remote_key(Tname, Key);<br>
Boolean -> Boolean<br>
end.<br>
<br>
is_remote_key(Tname, Key) -><br>
case mnesia:dirty_read(Tname, Key) of<br>
[] -> false;<br>
_ -> true<br>
end.<br>
<br>
are_all_keys(Tname, Keys) -><br>
Fun = case mnesia:table_info(Tname, storage_type) of<br>
unknown -> fun is_remote_key/2;<br>
_ -> fun is_key/2<br>
end,<br>
are_all_keys(Tname, Keys, Fun).<br>
<br>
are_all_keys(_Tname, [], _Fun) -> true;<br>
are_all_keys(Tname, [Key|Keys], Fun) -><br>
case Fun(Tname, Key) of<br>
false -> false;<br>
true -> are_all_keys(Tname, Keys, Fun)<br>
end.<br>
<br>
Below are the latencies measured with timer:tc (in microseconds).<br>
<br>
*** Table has a local copy ***<br>
13> timer:tc(mnesiaKeys, are_all_keys, [test, [1,2,3]]).<br>
{11,true}<br>
14> timer:tc(mnesiaKeys, are_all_keys, [test, [1,2,3,2,1,1,2,3]]).<br>
{14,true}<br>
15> timer:tc(mnesiaKeys, are_all_keys, [test, [1,2,3,2,3,1,2,3,1]]).<br>
{14,true}<br>
16> timer:tc(mnesiaKeys, are_all_keys, [test, [1,2,3,2,3,1,2,3,1]]).<br>
{13,true}<br>
17> timer:tc(mnesiaKeys, are_all_keys, [test, [1,2,3,2,3,1,2,3,1]]).<br>
{13,true}<br>
18> timer:tc(mnesiaKeys, are_all_keys, [test, [1,2,3,2,3,1,2,3,1]]).<br>
{14,true}<br>
<br>
*** Table is remote ***<br>
9> timer:tc(mnesiaKeys, are_all_keys, [test, [1,2,3]]).<br>
{975,true}<br>
10> timer:tc(mnesiaKeys, are_all_keys, [test, [1,2,3,2,3,1,2,3,1]]).<br>
{2151,true}<br>
11> timer:tc(mnesiaKeys, are_all_keys, [test, [1,2,3,2,3,1,2,3,1]]).<br>
{2003,true}<br>
12> timer:tc(mnesiaKeys, are_all_keys, [test, [1,2,3,2,3,1,2,3,1]]).<br>
{1898,true}<br>
13> timer:tc(mnesiaKeys, are_all_keys, [test, [1,2,3,2,3,1,2,3,1]]).<br>
{2027,true}<br>
<br>
Although I didn’t use UUIDs in my example, I think this is optimized enough. Please let me know if you think otherwise.<br>
<span class="HOEnZb"><font color="#888888"><br>
<br>
/Chaitanya<br>
</font></span><div class="HOEnZb"><div class="h5"><br>
_______________________________________________<br>
erlang-questions mailing list<br>
<a href="mailto:erlang-questions@erlang.org">erlang-questions@erlang.org</a><br>
<a href="http://erlang.org/mailman/listinfo/erlang-questions" rel="noreferrer" target="_blank">http://erlang.org/mailman/listinfo/erlang-questions</a><br>
</div></div></blockquote></div><br></div>
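<div>For what it's worth, the select-based presence check suggested at the top of this message can be sketched as a small module. This is only a sketch under stated assumptions: the module name uuid_presence, the payload field, and the setup/0 helper are hypothetical and exist only to make the example self-contained; my_table and my_record mirror the placeholders used above, and uuid is assumed to be the primary key of a set table. The batch check deduplicates its input first, because the length comparison is only meaningful for a list of unique UUIDs.</div>

```erlang
-module(uuid_presence).
-export([setup/0, is_present/1, all_present/1]).

%% Hypothetical record mirroring the my_record placeholder in the message.
%% Assumption: uuid is the first field and therefore the table's key.
-record(my_record, {uuid, payload}).

%% Illustration only: start Mnesia with a RAM-only schema and create a
%% small table with two rows. Call this once per node.
setup() ->
    ok = mnesia:start(),
    {atomic, ok} = mnesia:create_table(my_table,
        [{record_name, my_record},
         {attributes, record_info(fields, my_record)}]),
    ok = mnesia:dirty_write(my_table, #my_record{uuid = <<"a">>, payload = 1}),
    ok = mnesia:dirty_write(my_table, #my_record{uuid = <<"b">>, payload = 2}),
    ok.

%% Single-key check: the match spec returns the atom true instead of the
%% row, so the (possibly large) record is never copied to the caller.
is_present(UUID) ->
    MatchSpec = [{#my_record{uuid = UUID, _ = '_'}, [], [true]}],
    mnesia:dirty_select(my_table, MatchSpec) =/= [].

%% Batch check: one match-spec clause per unique UUID. Each matching row
%% contributes exactly one true, so with unique keys the result length
%% equals the number of UUIDs that are present.
all_present([]) ->
    true;
all_present(UUIDs) ->
    Unique = lists:usort(UUIDs),
    MatchSpec = [{#my_record{uuid = U, _ = '_'}, [], [true]} || U <- Unique],
    length(mnesia:dirty_select(my_table, MatchSpec)) =:= length(Unique).
```

<div>With the two rows written by setup/0, uuid_presence:all_present([<<"a">>, <<"b">>]) should return true, while adding a UUID that was never written should make it return false. The empty list is handled separately so that no empty match spec is ever passed to mnesia:dirty_select/2.</div>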