[erlang-questions] string concatenation efficiency

Khitai Pang khitai.pang@REDACTED
Thu Feb 4 02:46:26 CET 2016


Hi Joe,

Thank you for your advice! You are right, I should worry more about how
the message is transported and how the receiver parses the message. And
maybe I shouldn't worry about trivial performance differences until I
have a fully functioning system.


Thanks
Khitai

On 2016/2/2 0:01, Joe Armstrong wrote:
> On Sat, Jan 30, 2016 at 5:04 PM, Khitai Pang <khitai.pang@REDACTED> wrote:
>> Thank you all for your answers.
>>
>> The concatenated string is a redis key, it will be part of an erlang message
>> that will be sent to a redis client erlang process.   I tried sending the
>> iolist ["Item:{",ItemId,"}"], and it works fine. I think this is the most
>> efficient way.
> It *is* the most efficient way - but remember that whatever you send, some
> other program has to receive. The work done in transporting the message
> and delivering it to the receiver will far outweigh the cost of building the
> message prior to sending it.
>
> The rule of thumb is that parsing is *always* slower than serialising
> messages - why? In parsing you must take a decision at every byte. In
> serialising a message you take a decision at every node in a parse tree.
>
> So you should not be worrying about what you send - but you should worry about
> how it will be received.
>
> Building the message from variables into an I/O list or with "++"
> has a trivial cost compared to all the machine cycles taken up
> transporting the message.
>
> Even measuring the cost of different methods won't tell you much.
> You'll find out that A (IO-lists) is faster or slower than B (++) but all the
> real time is taken up in X and Y (which you didn't know about)
>
> People happily send KBytes of encrypted data in XML, when single
> unencrypted bytes might have sufficed.
>
> The only thing you should worry about is end-to-end efficiency.
>
> So:
>
>    1) Make everything
>    2) Measure
>    3) Fast enough?
>        Yes
>            Happy
>        No
>            Can I wait ten years?
>            No
>                measure to find the bit that takes the most time
>                fix the worst bit - go to 2)
>            Yes
>                wait ten years - go to 2)
>
> Patience is a virtue - the programs I wrote 20 years ago now run
> thousands of times faster - not because I'm a better programmer but
> because we have GHz clocks now (they were MHz 20 years ago)
>
> You want to write clear, short, elegant code that will still run in 20
> years' time
>
> Given the choice between clear code that is slow but fast-enough and
> fast code that is unclear I always choose the clear code.
>
> The key point is fast-enough.
>
> Cheers
>
> /Joe
>
>>
>> Thanks
>> Khitai
>>
>>
>> On 2016/1/29 9:24, Richard A. O'Keefe wrote:
>>>
>>> On 28/01/16 8:22 pm, Khitai Pang wrote:
>>>> For string concatenation, which one of the following is the most
>>>> efficient?
>>>>
>>>> 1)
>>>> string:join(["Item:{", ItemID, "}"], "")
>>>>
>>>> 2)
>>>> lists:append(["Item:{", ItemID, "}"])
>>>>
>>>> 3)
>>>> "Item:{" ++ ItemID ++ "}"
>>>>
>>>> Here ItemID is a UUID.
>>> For something this size, it hardly matters.
>>>
>>> The definition of lists:append/1 is
>>>     append([E])     -> E;
>>>     append([H|T]) -> H ++ append(T);
>>>     append([])      -> [].
>>> so lists:append([A,B,C]) -> A ++ lists:append([B,C])
>>>                                      -> A ++ B ++ lists:append([C])
>>>                                      -> A ++ B ++ C.
>>> Clearly, there is no way for this to be more efficient than
>>> writing A ++ B ++ C directly.
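A minimal sketch of the three candidates side by side, to make the equivalence concrete. The module and function names (concat_demo, key_append/1, key_plusplus/1, key_join/1) are made up for illustration and are not from the thread; ItemID is assumed to be a flat string (a list of characters), such as a textual UUID.

```erlang
-module(concat_demo).
-export([key_append/1, key_plusplus/1, key_join/1]).

%% Hypothetical helpers: each builds the same flat string from ItemID.
%% ItemID is assumed to be a list of characters, not a binary.

%% lists:append/1 unfolds to A ++ B ++ C for a three-element list.
key_append(ItemID) -> lists:append(["Item:{", ItemID, "}"]).

%% The direct form.
key_plusplus(ItemID) -> "Item:{" ++ ItemID ++ "}".

%% string:join/2 with an empty separator gives the same result.
key_join(ItemID) -> string:join(["Item:{", ItemID, "}"], "").
```

All three return the same list, so for a key of this size the choice can be made on readability alone.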
>>>
>>> The definition of string:join/2 is
>>>     join([], Sep) when is_list(Sep) -> [];
>>>     join([H|T], Sep) -> H ++ lists:append([Sep ++ X || X <- T]).
>>> -- at least in 18.1 -- which rather startled me because I was expecting
>>>     join([E], Sep) when is_list(Sep) -> E;
>>>     join([H|T], Sep) -> H ++ Sep ++ join(T, Sep);
>>>     join([], Sep) when is_list(Sep) -> [].
>>> Clearly, there is no way that either version of join/2 can be faster than
>>> append/1.  In fact
>>>   append/1 is measurably faster than
>>>   the revised join/2, which is measurably faster than
>>>   the original join/2,
>>> but you have to push the sizes ridiculously high to make these
>>> measurements.
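For anyone who wants to try such a measurement, a rough sketch using timer:tc/1 might look like the following. The module name concat_bench and the helpers run/2, time_it/2 and repeat/2 are hypothetical, and this is not the benchmark used in the thread. Micro-benchmarks like this are noisy, and as noted above the differences only show up at very large sizes.

```erlang
-module(concat_bench).
-export([run/2]).

%% Hypothetical micro-benchmark: time N repetitions of each approach.
%% Returns [{Name, Microseconds}] in the order append, plusplus, join.
run(ItemID, N) when is_list(ItemID), is_integer(N), N >= 0 ->
    Funs = [{append,   fun() -> lists:append(["Item:{", ItemID, "}"]) end},
            {plusplus, fun() -> "Item:{" ++ ItemID ++ "}" end},
            {join,     fun() -> string:join(["Item:{", ItemID, "}"], "") end}],
    [{Name, time_it(F, N)} || {Name, F} <- Funs].

%% timer:tc/1 returns {ElapsedMicroseconds, Result}.
time_it(F, N) ->
    {Micros, ok} = timer:tc(fun() -> repeat(F, N) end),
    Micros.

%% Call F N times, discarding the results.
repeat(_F, 0) -> ok;
repeat(F, N) -> _ = F(), repeat(F, N - 1).
```

A call such as concat_bench:run(UUID, 1000000) gives one timing per approach; the numbers only mean something when compared with the time spent elsewhere in the system.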
>>>
>>> Really, I suggest you write whichever makes your intentions clearest
>>> to human readers, get the program going, then measure it.  I would
>>> be surprised if this particular issue made any significant difference.
>>>
>>> But why are you using strings at all?
>>> Strings are an *interface* data type; you process them when you receive
>>> data from an external source and you generate them (or rather you
>>> generate iolists) when you are sending data to an external source, but
>>> for internal processing you usually want some sort of tree.
>>>
>>> Seeing "Item:{$ItemID}" suggests that maybe you *are* generating output
>>> for some external process, but in that case, perhaps you should just be
>>> sending along the iolist ["Item:{",ItemId,"}"] and *not* concatenating the
>>> pieces.
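A small sketch of the iolist approach, assuming the receiver accepts iodata. The module iolist_demo and the functions key/1 and flat_key/1 are illustrative names only. ItemId may be a string or a binary here, since both are valid iolist elements.

```erlang
-module(iolist_demo).
-export([key/1, flat_key/1]).

%% Build the key as an iolist: the pieces are not copied or concatenated,
%% they are just placed in a small nested list.
key(ItemId) -> ["Item:{", ItemId, "}"].

%% Flatten only when a consumer really needs one contiguous binary.
flat_key(ItemId) -> iolist_to_binary(key(ItemId)).
```

Ports, sockets and file I/O accept iodata directly, so the flattening step is often unnecessary.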
>>>
>>>
>>>
>>> _______________________________________________
>>> erlang-questions mailing list
>>> erlang-questions@REDACTED
>>> http://erlang.org/mailman/listinfo/erlang-questions
>>



