[erlang-questions] string concatenation efficiency

Jesper Louis Andersen jesper.louis.andersen@REDACTED
Thu Jan 28 13:53:30 CET 2016


On Thu, Jan 28, 2016 at 8:22 AM, Khitai Pang <khitai.pang@REDACTED>
wrote:

> For string concatenation, which one of the following is the most efficient?


Ask eministat, but note that the unused A ++ B ++ C is probably recognized by
the compiler as dead code, so for that variant we might be measuring little
more than the loop itself. Also, a single invocation makes no measurable
difference, so unless you are doing millions of these, it doesn't matter.
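
If the dead-code worry is real, one way to guard against it is to make the
loop's return value depend on every concatenation. The following is only a
sketch: the module name concat_keep and the function c3_keep are made up for
illustration, and the extra length/1 call adds its own cost, so it would have
to be added to every variant being compared.

```erlang
-module(concat_keep).
-export([c3_keep/1]).

%% Hypothetical variant of the ++ loop: the result of each
%% concatenation feeds the accumulator, so it cannot be discarded.
c3_keep(Items) -> c3_keep(Items, 10000, 0).

c3_keep(_, 0, Acc) -> Acc;
c3_keep([A, B, C] = Is, K, Acc) ->
    R = A ++ B ++ C,
    %% length/1 walks the result, which forces it to be built.
    c3_keep(Is, K - 1, Acc + length(R)).
```

The returned number is the total length of all strings built, which also
gives a cheap sanity check that the loop ran the expected number of times.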

Code and output follow. The ++ variant is fastest, followed by lists:append,
followed by string:join (labelled "strings:join/1" in the output, although
the call measured is string:join/2). But there is a lot of outlier variance
in these measurements, which suggests they are highly affected by outside
factors, so don't read too much into these numbers or their precision.

-----
-module(concat).

-export([t/0, datasets/0]).

%% Each cN/1 repeats its concatenation 10000 times per sample.
c1(Items) -> c1(Items, 10000).

c1(_, 0) -> ok;
c1(Items, K) ->
    string:join(Items, ""),
    c1(Items, K-1).

c2(Items) -> c2(Items, 10000).

c2(_, 0) -> ok;
c2(Items, K) ->
    lists:append(Items),
    c2(Items, K-1).

c3(Items) -> c3(Items, 10000).

c3(_, 0) -> ok;
c3([A,B,C] = Is, K) ->
    %% The result is discarded, so the compiler may be able to drop this.
    _ = A ++ B ++ C,
    c3(Is, K-1).

datasets() ->
    ItemID = "123134-123-12313-1--123-1231-",
    Input = ["Item:{", ItemID, "}"],
    [eministat:s("++",
                 fun() ->
                         c3(Input)
                 end, 50),
     %% Label kept as in the run shown; the call measured is string:join/2.
     eministat:s("strings:join/1",
                 fun() ->
                         c1(Input)
                 end, 50),
     eministat:s("lists:append",
                 fun() ->
                         c2(Input)
                 end, 50)].

t() ->
    [H | T] = datasets(),
    eministat:x(95.0, H, T).
--------------
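
Since the three loops are only comparable if they build the same string, a
quick check along these lines may be worth running first. This is a sketch,
not part of the benchmark above; the module name concat_same is invented for
illustration, and the input is the same one used in datasets/0.

```erlang
-module(concat_same).
-export([same/0]).

%% Returns the common result if all three variants agree,
%% and fails with a badmatch otherwise.
same() ->
    ItemID = "123134-123-12313-1--123-1231-",
    [A, B, C] = Items = ["Item:{", ItemID, "}"],
    R = A ++ B ++ C,
    R = lists:append(Items),
    R = string:join(Items, ""),
    R.
```

Because R is already bound after the first line that uses it, the two later
matches act as assertions rather than assignments.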

18> concat:t().
x ++
+ strings:join/1
* lists:append
+--------------------------------------------------------------------------+
|x xx   x  xxxxx+++****x   *+ ***x +** + +*+ ++ +++++++++++  xx      +    +|
|       x  xx xxxx*** *     x ****  *      + +  + +   ++++                 |
|       x  x  xxxx*** *       ****           +        ++++                 |
|       x     xxx ***         ***            +        ++++                 |
|       x     xxx * *         ***            +         +++                 |
|       x      x  *             *                      +++                 |
|              x  *                                    + +                 |
|              x  *                                    + +                 |
|              x  *                                      +                 |
|    |_________MA__________|                                               |
|                                             |______A_M____|              |
|                  |______AM_____|                                         |
+--------------------------------------------------------------------------+
------

Dataset: x N=50 CI=95.0000
Statistic     Value     [         Bias] (Bootstrapped LB‥UB)
Min:            1404.00
1st Qu.         1730.00
Median:         1803.00
3rd Qu.         1844.00
Max:            3180.00
Average:        1844.26 [   6.50820e-2] (      1778.54 ‥       1967.80)
Std. Dev:       320.492 [     -13.6930] (      177.997 ‥       502.461)

Outliers: 3/6 = 9 (μ=1844.33, σ=306.799)
        Outlier variance:      0.852347 (severe, the data set is probably unusable)

------

Dataset: + N=50 CI=95.0000
Statistic     Value     [         Bias] (Bootstrapped LB‥UB)
Min:            2339.00
1st Qu.         2779.00
Median:         2965.00
3rd Qu.         3017.00
Max:            3523.00
Average:        2903.64 [  -1.28160e-2] (      2847.28 ‥       2959.10)
Std. Dev:       200.957 [     -3.90939] (      159.507 ‥       269.636)

Outliers: 1/1 = 2 (μ=2903.63, σ=197.048)
        Outlier variance:      0.483689 (moderate)

Difference at 95.0% confidence
        1059.38 ± 106.139
        57.4420% ± 5.75510%
        (Student's t, pooled s = 267.487)
------

Dataset: * N=50 CI=95.0000
Statistic     Value     [         Bias] (Bootstrapped LB‥UB)
Min:            1841.00
1st Qu.         1927.00
Median:         2158.00
3rd Qu.         2300.00
Max:            2586.00
Average:        2125.90 [    -0.186946] (      2070.88 ‥       2184.36)
Std. Dev:       206.241 [     -2.58143] (      189.011 ‥       239.516)

Outliers: 0/0 = 0 (μ=2125.71, σ=203.660)
        Outlier variance:      0.645536 (severe, the data set is probably unusable)

Difference at 95.0% confidence
        281.640 ± 106.934
        15.2712% ± 5.79820%
        (Student's t, pooled s = 269.491)
------

ok
19>
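
The "Difference at 95.0% confidence" blocks can be checked by hand from the
averages and standard deviations printed above. The sketch below shows the
arithmetic as I understand it; it is not taken from eministat's source. The
module name diff_check is invented, and the Student's t critical value
(about 1.984 for 98 degrees of freedom) is an assumption supplied by the
caller.

```erlang
-module(diff_check).
-export([pooled_sd/4, diff/2, half_width/4]).

%% Pooled standard deviation of two samples with sizes N1, N2
%% and standard deviations S1, S2.
pooled_sd(N1, S1, N2, S2) ->
    math:sqrt(((N1 - 1) * S1 * S1 + (N2 - 1) * S2 * S2) / (N1 + N2 - 2)).

%% Absolute and relative (percent) difference of Mean against
%% the baseline mean Base.
diff(Base, Mean) ->
    D = Mean - Base,
    {D, 100 * D / Base}.

%% Half-width of the confidence interval for the difference.
%% T is the Student's t critical value (assumed, passed in by the caller).
half_width(T, Sp, N1, N2) ->
    T * Sp * math:sqrt(1 / N1 + 1 / N2).
```

For the first comparison, diff_check:diff(1844.26, 2903.64) gives roughly
1059.38 and 57.44%, and diff_check:pooled_sd(50, 320.492, 50, 200.957) gives
roughly 267.49, in line with the figures reported. The second comparison
works the same way with 2125.90 as the mean.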


-- 
J.