<div dir="ltr">I had same intuition so why I tried maps first. If you know how it is implemented (You can look at compiled bytecode) and know how BEAM works it shows how well is BEAM written or there is something fishy in maps implementation. (It's 18.3.)<div><br></div><div>Pichi</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Mar 30, 2016 at 8:23 PM, Jason Douglas <span dir="ltr"><<a href="mailto:jasond@icloud.com" target="_blank">jasond@icloud.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><div>At very large sizes though, wouldn’t the map (with O(1) access) become faster than the trie built by the compiler (which will be proportional to lookup term length, when sufficiently dense)? I used to deal a lot with pre-compiled tries and hash tables in C++, and while I loved tries for space efficiency, a hash table always had the upper hand in performance.</div><div><br></div><div>I can see though why for a small set like this, a carefully tuned implementation like Erlang’s would beat out the map.</div><span class="HOEnZb"><font color="#888888"><div><br></div><div>Jason</div></font></span><div><div class="h5"><br><div><blockquote type="cite"><div>On Mar 30, 2016, at 11:12 AM, Hynek Vychodil <<a href="mailto:vychodil.hynek@gmail.com" target="_blank">vychodil.hynek@gmail.com</a>> wrote:</div><br><div><div dir="ltr">Every time I read a claim about how fast it will be I have urge test it. I had an idea that constant map in a module could be faster than function clause co I test it.<div><br></div><div>I was wrong and RAO is right as usual. Function using function clause seems to be three times faster than using map.</div><div><br></div><div><div>x clause</div><div>+ map</div><div>+--------------------------------------------------------------------------+</div><div>|xxxxx +++++ +|</div><div>|xxxx ++++ |</div><div>|xxxx +++ |</div><div>|xxxx ++ |</div><div>|xxx ++ |</div><div>|xxx ++ |</div><div>|xx ++ |</div><div>|xx ++ |</div><div>|xx ++ |</div><div>|xx + |</div><div>|xx + |</div><div>|xx + |</div><div>|xx + |</div><div>|xx + |</div><div>| x + |</div><div>| x + |</div><div>| x + |</div><div>| x + |</div><div>| x + |</div><div>| x + |</div><div>| x + |</div><div>| x + |</div><div>| x + |</div><div>| x + |</div><div>| x + |</div><div>| + |</div><div>| + |</div><div>| + |</div><div>| + |</div><div>| + |</div><div>| + |</div><div>| + |</div><div>| + |</div><div>| + |</div><div>||A| |</div><div>| |_MA_| |</div><div>+--------------------------------------------------------------------------+</div><div>Dataset: x N=50 CI=95.0000</div><div>Statistic Value [ Bias] (Bootstrapped LB‥UB)</div><div>Min: 3490.00</div><div>1st Qu. 3551.00</div><div>Median: 3591.00</div><div>3rd Qu. 3679.00</div><div>Max: 3945.00</div><div>Average: 3630.16 [ 0.137534] ( 3602.82 ‥ 3664.56)</div><div>Std. Dev: 113.400 [ -1.81311] ( 90.8425 ‥ 141.539)</div><div><br></div><div>Outliers: 0/4 = 4 (μ=3630.30, σ=111.587)</div><div> Outlier variance: 0.151802 (moderate)</div><div><br></div><div>------</div><div><br></div><div>Dataset: + N=50 CI=95.0000</div><div>Statistic Value [ Bias] (Bootstrapped LB‥UB)</div><div>Min: 1.09500e+4</div><div>1st Qu. 1.10160e+4</div><div>Median: 1.10400e+4</div><div>3rd Qu. 1.11270e+4</div><div>Max: 1.28270e+4</div><div>Average: 1.11055e+4 [ 0.297998] ( 1.10611e+4 ‥ 1.12491e+4)</div><div>Std. Dev: 264.914 [ -31.0673] ( 84.7956 ‥ 582.629)</div><div><br></div><div>Outliers: 0/2 = 2 (μ=1.11058e+4, σ=233.847)</div><div> Outlier variance: 9.45082e-2 (slight)</div><div><br></div><div>Difference at 95.0% confidence</div><div> 7475.36 ± 80.8533</div><div> 205.924% ± 2.22726%</div><div> (Student's t, pooled s = 203.763)</div><div>------</div><div><br></div><div>It's about 31 million stopwords_clause:is_stopword/1 per second and 10 million stopwords_map:is_stopword/1 per second.</div><div><br></div><div>You can find code in gist <a href="https://gist.github.com/pichi/2d10c93242d5057913d026a607f07dd4" target="_blank">https://gist.github.com/pichi/2d10c93242d5057913d026a607f07dd4</a></div><div><br></div><div>Pichi</div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Mar 30, 2016 at 4:05 AM, Lloyd R. Prentice <span dir="ltr"><<a href="mailto:lloyd@writersglen.com" target="_blank">lloyd@writersglen.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">Wow! What a cool idea.<br>
<br>
Thanks, Richard.<br>
<br>
Best wishes,<br>
<br>
LRP<br>
<br>
Sent from my iPad<br>
<br>
> On Mar 29, 2016, at 8:47 PM, "Richard A. O'Keefe" <<a href="mailto:ok@cs.otago.ac.nz" target="_blank">ok@cs.otago.ac.nz</a>> wrote:<br>
<span>><br>
><br>
>> On 30/03/16 5:59 am, <a href="mailto:lloyd@writersglen.com" target="_blank">lloyd@writersglen.com</a> wrote:<br>
>> So, I have a printed list of stop words:<br>
>><br>
>> <a href="http://www.ranks.nl/stopwords" rel="noreferrer" target="_blank">http://www.ranks.nl/stopwords</a><br>
>><br>
>> I'd like to turn this list into an Erlang function that I can query---<br>
>><br>
>> stopwords() -><br>
>> ["word1", "word2" ... "wordN"].<br>
>><br>
>> is_stopword(Word) -><br>
>> List = stopwords(),<br>
>> lists_member(Word, List).<br>
</span>> Even if there is some arcane reason why you want the collection of words<br>
> as a list, I strongly suggest generating<br>
><br>
> is_stopword("a") -> true;<br>
> is_stopword("about") -> true;<br>
> ...<br>
> is_stopword("yourselves") -> true;<br>
> is_stopword(_) -> false.<br>
><br>
> Open the list of stopwords in vi.<br>
> :1,$s/^.*$/is_stopword("&") -> true;/<br>
> :$a<br>
> is_stopword(_) -> false.<br>
> <ESC><br>
><br>
> The Erlang compiler will turn this into a trie, roughly speaking.<br>
> This will be *dizzyingly* faster than the code you outlined.<br>
<div><div>><br>
><br>
><br>
><br>
>><br>
>> All my efforts so far have evolved into ugly kludges. Seems to me there must be an elegant method that I'm overlooking.<br>
>><br>
>> Some kind soul point the way?<br>
>><br>
>> Many thanks,<br>
>><br>
>> LRP<br>
>><br>
>> *********************************************<br>
>> My books:<br>
>><br>
>> THE GOSPEL OF ASHES<br>
>> <a href="http://thegospelofashes.com/" rel="noreferrer" target="_blank">http://thegospelofashes.com</a><br>
>><br>
>> Strength is not enough. Do they have the courage<br>
>> and the cunning? Can they survive long enough to<br>
>> save the lives of millions?<br>
>><br>
>> FREEIN' PANCHO<br>
>> <a href="http://freeinpancho.com/" rel="noreferrer" target="_blank">http://freeinpancho.com</a><br>
>><br>
>> A community of misfits help a troubled boy find his way<br>
>><br>
>> AYA TAKEO<br>
>> <a href="http://ayatakeo.com/" rel="noreferrer" target="_blank">http://ayatakeo.com</a><br>
>><br>
>> Star-crossed love, war and power in an alternative<br>
>> universe<br>
>><br>
>> Available through Amazon or by request from your<br>
>> favorite bookstore<br>
>><br>
>><br>
>> **********************************************<br>
>><br>
>> _______________________________________________<br>
>> erlang-questions mailing list<br>
>> <a href="mailto:erlang-questions@erlang.org" target="_blank">erlang-questions@erlang.org</a><br>
>> <a href="http://erlang.org/mailman/listinfo/erlang-questions" rel="noreferrer" target="_blank">http://erlang.org/mailman/listinfo/erlang-questions</a><br>
><br>
_______________________________________________<br>
erlang-questions mailing list<br>
<a href="mailto:erlang-questions@erlang.org" target="_blank">erlang-questions@erlang.org</a><br>
<a href="http://erlang.org/mailman/listinfo/erlang-questions" rel="noreferrer" target="_blank">http://erlang.org/mailman/listinfo/erlang-questions</a><br>
</div></div></blockquote></div><br></div></div></div>
_______________________________________________<br>erlang-questions mailing list<br><a href="mailto:erlang-questions@erlang.org" target="_blank">erlang-questions@erlang.org</a><br><a href="http://erlang.org/mailman/listinfo/erlang-questions" target="_blank">http://erlang.org/mailman/listinfo/erlang-questions</a><br></div></blockquote></div><br></div></div></div></blockquote></div><br></div>