[erlang-questions] Printed list to Erlang function
Hynek Vychodil
vychodil.hynek@REDACTED
Wed Mar 30 22:04:52 CEST 2016
Here is the result for the long list (667 words):
x clause
+ map
[ASCII histogram of the clause (x) and map (+) timings; column alignment was lost in the archive]
Dataset: x N=50 CI=95.0000
Statistic Value [ Bias] (Bootstrapped LB‥UB)
Min: 5087.00
1st Qu. 5113.00
Median: 5137.00
3rd Qu. 5188.00
Max: 7081.00
Average: 5205.64 [ 0.729718] ( 5157.08 ‥ 5372.30)
Std. Dev: 287.752 [ -33.4550] ( 81.6038 ‥ 633.923)
Outliers: 0/4 = 4 (μ=5206.37, σ=254.297)
Outlier variance: 0.365232 (moderate)
------
Dataset: + N=50 CI=95.0000
Statistic Value [ Bias] (Bootstrapped LB‥UB)
Min: 1.13720e+4
1st Qu. 1.14450e+4
Median: 1.14890e+4
3rd Qu. 1.15510e+4
Max: 1.51180e+4
Average: 1.16464e+4 [ -0.578036] ( 1.15250e+4 ‥ 1.19671e+4)
Std. Dev: 661.815 [ -48.2839] ( 336.017 ‥ 1217.81)
Outliers: 0/3 = 3 (μ=1.16458e+4, σ=613.531)
Outlier variance: 0.384516 (moderate)
Difference at 95.0% confidence
6440.78 ± 202.485
123.727% ± 3.88972%
(Student's t, pooled s = 510.294)
------
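The relative difference reported above can be reproduced from the two averages. A minimal sketch, assuming the percentage is simply the difference of the averages divided by the clause (x) average; the module and function names are hypothetical and not part of the benchmark tool:

```erlang
-module(bench_diff).
-export([relative_diff/2]).

%% Relative difference of Candidate against Baseline, in percent.
%% Baseline and Candidate are average timings in the same unit.
relative_diff(Baseline, Candidate) when Baseline > 0 ->
    (Candidate - Baseline) / Baseline * 100.
```

Calling bench_diff:relative_diff(5205.64, 11646.4) gives roughly 123.7, which agrees with the 123.727% in the report, so on the long list the map version takes a bit more than twice as long as the clause version.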
The function clause version is still faster and manages a nice 22 million
calls per second.
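For readers without the gist at hand, the two variants being compared look roughly like this. This is only a sketch with three sample words, assuming the map variant checks key membership in a constant map; the module name, the function names and the use of maps:is_key/2 are illustrative assumptions and may differ from the benchmarked code:

```erlang
-module(stopwords_sketch).
-export([is_stopword_clause/1, is_stopword_map/1]).

%% Variant 1: one function clause per stop word, with a catch-all
%% clause at the end that returns false for everything else.
is_stopword_clause("a") -> true;
is_stopword_clause("about") -> true;
is_stopword_clause("yourselves") -> true;
is_stopword_clause(_) -> false.

%% Variant 2: membership test against a constant map.
is_stopword_map(Word) ->
    maps:is_key(Word, stopword_map()).

%% The map is written as a literal with constant keys and values.
stopword_map() ->
    #{"a" => true, "about" => true, "yourselves" => true}.
```

Both functions return the same answers; only the lookup mechanism differs.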
Pichi
On Wed, Mar 30, 2016 at 8:33 PM, Lloyd R. Prentice <lloyd@REDACTED>
wrote:
> Hi Pichi,
>
> Since I haven't learned yet how to design and conduct performance tests,
> results like these are both interesting and comforting.
>
> The long stop words list in http://www.ranks.nl/stopwords has something
> less than 700 words. So from these results it looks like either method
> would do the job in most applications, unless you are filtering stop words
> out of a huge archive of long documents.
>
> Many thanks, Pichi.
>
> Best wishes,
>
> LRP
>
> Sent from my iPad
>
> On Mar 30, 2016, at 2:12 PM, Hynek Vychodil <vychodil.hynek@REDACTED>
> wrote:
>
> Every time I read a claim about how fast something will be, I have the urge
> to test it. I had an idea that a constant map in a module could be faster
> than function clauses, so I tested it.
>
> I was wrong and RAO is right, as usual. The function using function clauses
> seems to be three times faster than the one using a map.
>
> x clause
> + map
>
> [ASCII histogram of the clause (x) and map (+) timings; column alignment was lost in the archive]
> Dataset: x N=50 CI=95.0000
> Statistic Value [ Bias] (Bootstrapped LB‥UB)
> Min: 3490.00
> 1st Qu. 3551.00
> Median: 3591.00
> 3rd Qu. 3679.00
> Max: 3945.00
> Average: 3630.16 [ 0.137534] ( 3602.82 ‥ 3664.56)
> Std. Dev: 113.400 [ -1.81311] ( 90.8425 ‥ 141.539)
>
> Outliers: 0/4 = 4 (μ=3630.30, σ=111.587)
> Outlier variance: 0.151802 (moderate)
>
> ------
>
> Dataset: + N=50 CI=95.0000
> Statistic Value [ Bias] (Bootstrapped LB‥UB)
> Min: 1.09500e+4
> 1st Qu. 1.10160e+4
> Median: 1.10400e+4
> 3rd Qu. 1.11270e+4
> Max: 1.28270e+4
> Average: 1.11055e+4 [ 0.297998] ( 1.10611e+4 ‥ 1.12491e+4)
> Std. Dev: 264.914 [ -31.0673] ( 84.7956 ‥ 582.629)
>
> Outliers: 0/2 = 2 (μ=1.11058e+4, σ=233.847)
> Outlier variance: 9.45082e-2 (slight)
>
> Difference at 95.0% confidence
> 7475.36 ± 80.8533
> 205.924% ± 2.22726%
> (Student's t, pooled s = 203.763)
> ------
>
> That is about 31 million stopwords_clause:is_stopword/1 calls per second
> and 10 million stopwords_map:is_stopword/1 calls per second.
>
> You can find the code in this gist:
> https://gist.github.com/pichi/2d10c93242d5057913d026a607f07dd4
>
> Pichi
>
> On Wed, Mar 30, 2016 at 4:05 AM, Lloyd R. Prentice <lloyd@REDACTED>
> wrote:
>
>> Wow! What a cool idea.
>>
>> Thanks, Richard.
>>
>> Best wishes,
>>
>> LRP
>>
>> Sent from my iPad
>>
>> > On Mar 29, 2016, at 8:47 PM, "Richard A. O'Keefe" <ok@REDACTED> wrote:
>> >
>> >
>> >> On 30/03/16 5:59 am, lloyd@REDACTED wrote:
>> >> So, I have a printed list of stop words:
>> >>
>> >> http://www.ranks.nl/stopwords
>> >>
>> >> I'd like to turn this list into an Erlang function that I can query---
>> >>
>> >> stopwords() ->
>> >> ["word1", "word2" ... "wordN"].
>> >>
>> >> is_stopword(Word) ->
>> >> List = stopwords(),
>> >> lists:member(Word, List).
>> > Even if there is some arcane reason why you want the collection of words
>> > as a list, I strongly suggest generating
>> >
>> > is_stopword("a") -> true;
>> > is_stopword("about") -> true;
>> > ...
>> > is_stopword("yourselves") -> true;
>> > is_stopword(_) -> false.
>> >
>> > Open the list of stopwords in vi.
>> > :1,$s/^.*$/is_stopword("&") -> true;/
>> > :$a
>> > is_stopword(_) -> false.
>> > <ESC>
>> >
>> > The Erlang compiler will turn this into a trie, roughly speaking.
>> > This will be *dizzyingly* faster than the code you outlined.
>> >
>> >
>> >
>> >
>> >>
>> >> All my efforts so far have evolved into ugly kludges. Seems to me
>> >> there must be an elegant method that I'm overlooking.
>> >>
>> >> Some kind soul point the way?
>> >>
>> >> Many thanks,
>> >>
>> >> LRP
>> >>
>> >>
>> >> _______________________________________________
>> >> erlang-questions mailing list
>> >> erlang-questions@REDACTED
>> >> http://erlang.org/mailman/listinfo/erlang-questions
>> >
>>
>
>