[erlang-questions] Hidden binaries

Wed May 28 12:35:38 CEST 2014

On Wed, May 28, 2014 at 11:42 AM, Jesper Louis Andersen
<jesper.louis.andersen@REDACTED> wrote:
>
> On Wed, May 28, 2014 at 11:40 AM, Loïc Hoguin <essen@REDACTED> wrote:
>>
>> Actually, and correct me if I'm wrong, but the sub binary optimization
>> breaks when you stick the identifier somewhere, so it ends up being copied
>> automatically.
>
>
> This happens if you send the binary in a message to another process. But
> AFAIK it won't happen if storing the binary in ETS.
>

I think there is some confusion here about the different
binary optimizations.

Sub-binaries are created when a binary is matched using bit
syntax or split_binary/2. There will never be any automatic
conversion from a sub-binary to a copy of the binary data
referenced by the sub-binary.
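
As a rough illustration of the point above (the module and function
names are made up for this example), binary:referenced_byte_size/1
can be used to see that a sub-binary keeps referring to the original
data until the program copies it explicitly with binary:copy/1:

```erlang
-module(subbin_demo).
-export([sizes/0]).

%% Hypothetical helper: returns the size of a sub-binary, the size of
%% the data it references, and the referenced size after an explicit copy.
sizes() ->
    %% 1000 bytes, large enough to be stored off-heap (reference-counted).
    Bin = binary:copy(<<"x">>, 1000),
    %% Matching with the bit syntax creates a sub-binary pointing into Bin.
    <<_:10/binary, Rest/binary>> = Bin,
    %% Nothing copies the data behind our back; only an explicit
    %% binary:copy/1 detaches Rest from the original 1000 bytes.
    Copy = binary:copy(Rest),
    {byte_size(Rest),
     binary:referenced_byte_size(Rest),
     binary:referenced_byte_size(Copy)}.
```

Calling subbin_demo:sizes() should show that Rest is 990 bytes long
but still references all 1000 bytes, while the explicit copy only
references its own 990 bytes.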

There is a compiler-based optimization that will delay
creation of a sub-binary to optimize loops that do
binary matching. Instead of creating a new sub-binary
for every iteration of the loop, the internal match state
structure used for binary matching is kept in the loop.
As soon as the loop is exited, the match state will be
converted to a real sub-binary.
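
A minimal sketch of the kind of loop this applies to (module and
function names are invented for the example). Whether the
optimization is actually applied to a given function depends on the
compiler; compiling with the bin_opt_info option makes the compiler
report its decision:

```erlang
-module(match_loop).
-export([count_zeros/1]).

%% Counts the zero bytes in a binary.
count_zeros(Bin) when is_binary(Bin) ->
    count_zeros(Bin, 0).

%% Rest is only passed straight on to the next iteration, where it is
%% matched again, so the compiler can keep the match state in the loop
%% instead of building a sub-binary for every byte.
count_zeros(<<0, Rest/binary>>, N) ->
    count_zeros(Rest, N + 1);
count_zeros(<<_, Rest/binary>>, N) ->
    count_zeros(Rest, N);
count_zeros(<<>>, N) ->
    N.
```

For example, match_loop:count_zeros(<<0,1,0,0,2,0>>) returns 4.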

There is another optimization when appending to
binaries. When a binary is appended to, i.e. if
you write

  <<Bin/binary, ...>>

the run-time system assumes that the program will append
to the resulting binary again and therefore allocates
empty space at the end of the binary. If the binary
is sent to another process or stored in an ETS table,
the extra space is deallocated and the binary is
no longer marked as appendable. That optimization
is done entirely in the run-time system.
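
A small sketch of an append loop that would benefit from this
(module and function names are made up for the example; the extra
space itself is an internal detail and is not visible from Erlang
code):

```erlang
-module(append_demo).
-export([build/1]).

%% Builds a binary holding the integers N, N-1, ..., 1 as 32-bit values.
build(N) when is_integer(N), N >= 0 ->
    build(N, <<>>).

build(0, Acc) ->
    Acc;
build(N, Acc) ->
    %% Acc is the first segment of the new binary and is appended to on
    %% every iteration, which is the <<Bin/binary, ...>> pattern above.
    build(N - 1, <<Acc/binary, N:32>>).
```

For example, append_demo:build(3) returns
<<0,0,0,3,0,0,0,2,0,0,0,1>>.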

/Bjorn

-- 
Björn Gustavsson, Erlang/OTP, Ericsson AB