[erlang-questions] list vs binary performarnce, destructuring and consing

Björn Gustavsson bgustavsson@REDACTED
Thu Oct 25 10:14:52 CEST 2012


On Wed, Oct 24, 2012 at 5:58 PM, Erik Pearson <erik@REDACTED> wrote:
[...]
> And from the section on optimizations:
>
> The basis of this optimization is that it the emulator can create bit
> strings with extra uninitialized space, so if a bit string is built by
> continuously appending to a binary the data does not need to be copied if
> there is enough uninitialized data at the end of the bit string.
>
> One point I don't quite understand here:
>
> http://www.erlang.org/doc/efficiency_guide/binaryhandling.html#match_context
>
> is is this function:
>
> my_list_to_binary(List) ->
>     my_list_to_binary(List, <<>>).
>
> my_list_to_binary([H|T], Acc) ->
>     my_list_to_binary(T, <<Acc/binary,H>>);
> my_list_to_binary([], Acc) ->
>     Acc.
>
> the first iteration in my_list_binary/2 that Acc will be copied, but it will
> not be copied after this. I don't know why any copying occurs at all, since
> it is evident that in this case the initial binary <<>> is used solely for
> the append operation in the first clause. From what I understand, it is
> multiple references that cause copying to occur upon append, since the
> append operation may result the movement of the binary in memory, and the
> other references would otherwise become invalid. I only see one reference
> threading through this function.

Appending to a binary is optimized by the run-time system (which is
stated in the Efficiency Guide) with no help from the compiler. Therefore,
the run-time system has no way of knowing that the binary created in
my_list_to_binary/1 will be appended to, so it will *not* mark the empty
binary as appendable and reserve extra space in it for appending.

>
> I would think it would be allocated in my_list_to_binary/1, passed by
> reference, have an initial size of 256, be appended to without any copying,
> and only maybe be copied if the accumulation exceeded 256 bytes (the binary
> may be moved when it is extended.) Is the implication is that the initial
> blank binary <<>> is a different type of binary, say fixed-length or static,
> and that the first append operation triggers the copy of that binary into a
> newly allocated extendable binary (with initial size twice the original or
> 256 whichever is bigger)?

Yes, correct. The initial empty binary is a different type of binary. It is
a heap binary stored in the constant pool for the module, so the
"creation" of it is extremely cheap.

/Björn

-- 
Björn Gustavsson, Erlang/OTP, Ericsson AB



More information about the erlang-questions mailing list