[erlang-questions] join/2 function
Serge Aleynikov
serge@REDACTED
Mon Dec 18 16:04:33 CET 2006
Here's another version 't5' that concatinates elements using stack
instead of an accumulator:
1> implode:t5(implode:l1(), ",").
"abc,def,ghi,jkl,mno,pqr,stu,vxyz,123456789,987654321"
2> implode:test(1000, implode:l1()).
do1(1000,L,"."): 7176us
do2(1000,L,"."): 7376us
do3(1000,L,"."): 3225us
do4(1000,L,"."): 4608us
do5(1000,L,"."): 2620us
ok
3> implode:test(1000, implode:l2()).
do1(1000,L,"."): 447673us
do2(1000,L,"."): 185972us
do3(1000,L,"."): 61102us
do4(1000,L,"."): 46755us
do5(1000,L,"."): 39452us
ok
4> implode:test(10000, implode:l2()).
do1(10000,L,"."): 4840698us
do2(10000,L,"."): 1861653us
do3(10000,L,"."): 613191us
do4(10000,L,"."): 472865us
do5(10000,L,"."): 393750us
Regards,
Serge
Ulf Wiger (TN/EAB) wrote:
> Actually, if you increase the size of the data set,
> you will get slightly different results (t3 is still
> the cheapest, though):
>
> 1> L = lists:duplicate(50,lists:duplicate(20,$a)).
> ["aaaaaaaaaaaaaaaaaaaa",
> "aaaaaaaaaaaaaaaaaaaa"|...]
> 2> timer:tc(implode,t1,[L1,"."]).
> {832,
> "aaaaaaaaaaaaaaaaaaaa.aaaaaaaaaaaaaaaaaaaa.|..."}
> 3> timer:tc(implode,t2,[L1,"."]).
> {185,
> "aaaaaaaaaaaaaaaaaaaa.aaaaaaaaaaaaaaaaaaaa.|..."}
> 4> timer:tc(implode,t3,[L1,"."]).
> {49,
> "aaaaaaaaaaaaaaaaaaaa.aaaaaaaaaaaaaaaaaaaa.|..."}
> 5> timer:tc(implode,t4,[L1,"."]).
> {63,
> "aaaaaaaaaaaaaaaaaaaa.aaaaaaaaaaaaaaaaaaaa.|..."}
>
> where
>
> t4([H|T], Sep) ->
> H ++ lists:concat([Sep ++ X || X <- T]).
>
> t3 is consistently better than t4(*). The reason
> why both of them are better than the others is
> that they reduce the number of calls to ++
> (lists:concat() uses ++, but flatten() doesn't)
>
> (*) Except if L is a short list of very long substrings,
> e.g. lists:duplicate(3,lists:duplicate(1000,$a)).
>
> The nice thing about t3 is obviously that it always
> puts the (growing) accumulator last in the append.
> The left hand side of ++ is always copied, which is
> why t1 is so bad. The t2 version using flatten()
> gets badly beaten for two reasons: (1) flatten
> always rebuilds the whole list, while ++ doesn't
> have to rebuild the RHS, and (2) it is implemented
> purely in erlang, while ++ is a BIF.
>
>
> BUT the big problem with running tests this way is
> that we don't get a fair picture of the GC component.
> Let's run a number of iterations for each:
>
> do1(N, L,S) when N > 0 ->
> t1(L,S),
> do1(N-1,L,S);
> do1(_,_,_) ->
> erlang:garbage_collect(),
> ok.
> ....
>
> do4(N, L,S) when N > 0 ->
> t4(L,S),
> do4(N-1,L,S);
> do4(_,_,_) ->
> erlang:garbage_collect(),
> ok.
>
> Using L = ["abc","def","ghi","jkl","mno","pqr","stu
> ","vxyz","123456789","987654321"]
>
> do1(1000,L,"."): 7428 us
> do2(1000,L,"."): 12519 us
> do3(1000,L,"."): 5370 us
> do4(1000,L,"."): 7388 us
>
> Using L2 = lists:duplicate(50,lists:duplicate(50,$a)).
>
> do1(1000,L2,"."): 1353298 us
> do2(1000,L2,"."): 417940 us
> do3(1000,L2,"."): 121037 us
> do4(1000,L2,"."): 72817 us
>
> Here, t2 makes a comeback since it at least
> avoids appending in the inner loop.
>
> Conclusion: benchmarking is tricky. (:
>
> BR,
> Ulf W
>
>> -----Original Message-----
>> From: erlang-questions-bounces@REDACTED
>> [mailto:erlang-questions-bounces@REDACTED] On Behalf Of
>> Torbjorn Tornkvist
>> Sent: den 18 december 2006 09:34
>> To: erlang-questions@REDACTED
>> Subject: Re: [erlang-questions] join/2 function
>>
>>
>> This was discussed recently on the #erlang channel.
>> My take on this was:
>>
>>
>> t3(List, Sep) ->
>> lists:foldr(fun(X,[]) -> X; (X,Acc) -> X++Sep++Acc end,
>> "", List).
>>
>>
>> See the attached file for all three versions.
>>
>> 3>
>> timer:tc(implode,t1,[["abc","def","ghi","jkl","mno","pqr","stu
>> ","vxyz","123456789","987654321"],
>> "."]).
>> {7,"abc.def.ghi.jkl.mno.pqr.stu.vxyz.123456789.987654321"}
>> 4>
>> timer:tc(implode,t1,[["abc","def","ghi","jkl","mno","pqr","stu
>> ","vxyz","123456789","987654321"],
>> "."]).
>> {7,"abc.def.ghi.jkl.mno.pqr.stu.vxyz.123456789.987654321"}
>> 5>
>> timer:tc(implode,t2,[["abc","def","ghi","jkl","mno","pqr","stu
>> ","vxyz","123456789","987654321"],
>> "."]).
>> {10,"abc.def.ghi.jkl.mno.pqr.stu.vxyz.123456789.987654321"}
>> 6>
>> timer:tc(implode,t2,[["abc","def","ghi","jkl","mno","pqr","stu
>> ","vxyz","123456789","987654321"],
>> "."]).
>> {10,"abc.def.ghi.jkl.mno.pqr.stu.vxyz.123456789.987654321"}
>> 7>
>> timer:tc(implode,t3,[["abc","def","ghi","jkl","mno","pqr","stu
>> ","vxyz","123456789","987654321"],
>> "."]).
>> {5,"abc.def.ghi.jkl.mno.pqr.stu.vxyz.123456789.987654321"}
>> 8>
>> timer:tc(implode,t3,[["abc","def","ghi","jkl","mno","pqr","stu
>> ","vxyz","123456789","987654321"],
>> "."]).
>> {5,"abc.def.ghi.jkl.mno.pqr.stu.vxyz.123456789.987654321"}
>>
>>
>> --Tobbe
>>
>>
>> Matthias Lang wrote:
>>> Hoan Ton-That writes:
>>>
>>> > I think that the following function join should
>>> > be included in the lists module.
>>> >
>>> > join(Sep, List) ->
>>> > lists:foldl(fun(A, "") -> A; (A, Acc) -> Acc ++ Sep ++
>> A end, "", List).
>>> Your implementation of join is quadratic in the length of the list.
>>>
>>> Here's one O(N) way to do it:
>>>
>>> lcjoin(Sep, [H|T]) ->
>>> Joined = [ [Sep|X] || X <- T ],
>>> lists:flatten([H|Joined]).
>>>
>>> In many cases the lists:flatten/1 call is unnecessary.
>>>
>>> Matthias
>>
>
> _______________________________________________
> erlang-questions mailing list
> erlang-questions@REDACTED
> http://www.erlang.org/mailman/listinfo/erlang-questions
>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: implode.erl
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20061218/02bc42be/attachment.ksh>
More information about the erlang-questions
mailing list