[erlang-questions] Printed list to Erlang function
Sverker Eriksson
sverker.eriksson@REDACTED
Thu Mar 31 21:31:44 CEST 2016
Small maps uses an array with linear search for key.
Large maps (> 32 keys) uses a HAMT tree with a maximum branching factor
of 16.
https://en.wikipedia.org/wiki/Hash_array_mapped_trie
/Sverker, Erlang/OTP
On 03/30/2016 10:11 PM, Hynek Vychodil wrote:
> I had same intuition so why I tried maps first. If you know how it is
> implemented (You can look at compiled bytecode) and know how BEAM
> works it shows how well is BEAM written or there is something fishy in
> maps implementation. (It's 18.3.)
>
> Pichi
>
> On Wed, Mar 30, 2016 at 8:23 PM, Jason Douglas <jasond@REDACTED
> <mailto:jasond@REDACTED>> wrote:
>
> At very large sizes though, wouldn’t the map (with O(1) access)
> become faster than the trie built by the compiler (which will be
> proportional to lookup term length, when sufficiently dense)? I
> used to deal a lot with pre-compiled tries and hash tables in C++,
> and while I loved tries for space efficiency, a hash table always
> had the upper hand in performance.
>
> I can see though why for a small set like this, a carefully tuned
> implementation like Erlang’s would beat out the map.
>
> Jason
>
>> On Mar 30, 2016, at 11:12 AM, Hynek Vychodil
>> <vychodil.hynek@REDACTED <mailto:vychodil.hynek@REDACTED>> wrote:
>>
>> Every time I read a claim about how fast it will be I have urge
>> test it. I had an idea that constant map in a module could be
>> faster than function clause co I test it.
>>
>> I was wrong and RAO is right as usual. Function using function
>> clause seems to be three times faster than using map.
>>
>> x clause
>> + map
>> +--------------------------------------------------------------------------+
>> |xxxxx +++++ +|
>> |xxxx ++++ |
>> |xxxx +++ |
>> |xxxx ++ |
>> |xxx ++ |
>> |xxx ++ |
>> |xx ++ |
>> |xx ++ |
>> |xx ++ |
>> |xx + |
>> |xx + |
>> |xx + |
>> |xx + |
>> |xx + |
>> | x + |
>> | x + |
>> | x + |
>> | x + |
>> | x + |
>> | x + |
>> | x + |
>> | x + |
>> | x + |
>> | x + |
>> | x + |
>> | + |
>> | + |
>> | + |
>> | + |
>> | + |
>> | + |
>> | + |
>> | + |
>> | + |
>> ||A| |
>> | |_MA_| |
>> +--------------------------------------------------------------------------+
>> Dataset: x N=50 CI=95.0000
>> Statistic Value [ Bias] (Bootstrapped LB‥UB)
>> Min: 3490.00
>> 1st Qu. 3551.00
>> Median: 3591.00
>> 3rd Qu. 3679.00
>> Max: 3945.00
>> Average: 3630.16 [ 0.137534] ( 3602.82 ‥
>> 3664.56)
>> Std. Dev: 113.400 [ -1.81311] ( 90.8425 ‥
>> 141.539)
>>
>> Outliers: 0/4 = 4 (μ=3630.30, σ=111.587)
>> Outlier variance: 0.151802 (moderate)
>>
>> ------
>>
>> Dataset: + N=50 CI=95.0000
>> Statistic Value [ Bias] (Bootstrapped LB‥UB)
>> Min: 1.09500e+4
>> 1st Qu. 1.10160e+4
>> Median: 1.10400e+4
>> 3rd Qu. 1.11270e+4
>> Max: 1.28270e+4
>> Average: 1.11055e+4 [ 0.297998] ( 1.10611e+4 ‥
>> 1.12491e+4)
>> Std. Dev: 264.914 [ -31.0673] ( 84.7956 ‥
>> 582.629)
>>
>> Outliers: 0/2 = 2 (μ=1.11058e+4, σ=233.847)
>> Outlier variance: 9.45082e-2 (slight)
>>
>> Difference at 95.0% confidence
>> 7475.36 ± 80.8533
>> 205.924% ± 2.22726%
>> (Student's t, pooled s = 203.763)
>> ------
>>
>> It's about 31 million stopwords_clause:is_stopword/1 per second
>> and 10 million stopwords_map:is_stopword/1 per second.
>>
>> You can find code in gist
>> https://gist.github.com/pichi/2d10c93242d5057913d026a607f07dd4
>>
>> Pichi
>>
>> On Wed, Mar 30, 2016 at 4:05 AM, Lloyd R. Prentice
>> <lloyd@REDACTED <mailto:lloyd@REDACTED>> wrote:
>>
>> Wow! What a cool idea.
>>
>> Thanks, Richard.
>>
>> Best wishes,
>>
>> LRP
>>
>> Sent from my iPad
>>
>> > On Mar 29, 2016, at 8:47 PM, "Richard A. O'Keefe"
>> <ok@REDACTED <mailto:ok@REDACTED>> wrote:
>> >
>> >
>> >> On 30/03/16 5:59 am, lloyd@REDACTED
>> <mailto:lloyd@REDACTED> wrote:
>> >> So, I have a printed list of stop words:
>> >>
>> >> http://www.ranks.nl/stopwords
>> >>
>> >> I'd like to turn this list into an Erlang function that I
>> can query---
>> >>
>> >> stopwords() ->
>> >> ["word1", "word2" ... "wordN"].
>> >>
>> >> is_stopword(Word) ->
>> >> List = stopwords(),
>> >> lists_member(Word, List).
>> > Even if there is some arcane reason why you want the
>> collection of words
>> > as a list, I strongly suggest generating
>> >
>> > is_stopword("a") -> true;
>> > is_stopword("about") -> true;
>> > ...
>> > is_stopword("yourselves") -> true;
>> > is_stopword(_) -> false.
>> >
>> > Open the list of stopwords in vi.
>> > :1,$s/^.*$/is_stopword("&") -> true;/
>> > :$a
>> > is_stopword(_) -> false.
>> > <ESC>
>> >
>> > The Erlang compiler will turn this into a trie, roughly
>> speaking.
>> > This will be *dizzyingly* faster than the code you outlined.
>> >
>> >
>> >
>> >
>> >>
>> >> All my efforts so far have evolved into ugly kludges.
>> Seems to me there must be an elegant method that I'm overlooking.
>> >>
>> >> Some kind soul point the way?
>> >>
>> >> Many thanks,
>> >>
>> >> LRP
>> >>
>> >> *********************************************
>> >> My books:
>> >>
>> >> THE GOSPEL OF ASHES
>> >> http://thegospelofashes.com <http://thegospelofashes.com/>
>> >>
>> >> Strength is not enough. Do they have the courage
>> >> and the cunning? Can they survive long enough to
>> >> save the lives of millions?
>> >>
>> >> FREEIN' PANCHO
>> >> http://freeinpancho.com <http://freeinpancho.com/>
>> >>
>> >> A community of misfits help a troubled boy find his way
>> >>
>> >> AYA TAKEO
>> >> http://ayatakeo.com <http://ayatakeo.com/>
>> >>
>> >> Star-crossed love, war and power in an alternative
>> >> universe
>> >>
>> >> Available through Amazon or by request from your
>> >> favorite bookstore
>> >>
>> >>
>> >> **********************************************
>> >>
>> >> _______________________________________________
>> >> erlang-questions mailing list
>> >> erlang-questions@REDACTED
>> <mailto:erlang-questions@REDACTED>
>> >> http://erlang.org/mailman/listinfo/erlang-questions
>> >
>> _______________________________________________
>> erlang-questions mailing list
>> erlang-questions@REDACTED <mailto:erlang-questions@REDACTED>
>> http://erlang.org/mailman/listinfo/erlang-questions
>>
>>
>> _______________________________________________
>> erlang-questions mailing list
>> erlang-questions@REDACTED <mailto:erlang-questions@REDACTED>
>> http://erlang.org/mailman/listinfo/erlang-questions
>
>
>
>
> _______________________________________________
> erlang-questions mailing list
> erlang-questions@REDACTED
> http://erlang.org/mailman/listinfo/erlang-questions
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20160331/a4d1c592/attachment.htm>
More information about the erlang-questions
mailing list