[erlang-questions] binary optimization

Sat Jul 18 14:54:28 CEST 2009

Joel Reymont wrote:
> What about this one? I'm calling session:subscribe/2 and that's known at
> compile time.
> 
> src/transport.erl:98: Warning: NOT OPTIMIZED: sub binary used by
> session:subscribe/2
> 
> 98: handle_info({tcp, Sock, <<_:96, ?SUBSCRIBE, Topic/binary>>},
> State) ->
>     inet:setopts(Sock, [{active, once}]),
>     session:subscribe(State#state.session, Topic),
>     {noreply, State};

Well, no. The session module could be dynamically replaced at any
time, so the compiler can't really make any assumptions about what
that function does (or will do after a future code upgrade).

You should probably not dwell too much on these warnings. They are
hints to allow you to optimize what you're doing with binaries,
but those optimizations can only happen in particular (though
important) circumstances. Your code does not seem to be an example
of those circumstances.

Though I haven't taken a closer look at these binary optimizations
myself, this is my understanding of it:

 1) When you match out a subsection of a binary, the system may use a
    small (a few words) representation of this subsection in the form
    of a "sub-binary", which is simply a reference that says "I'm this
    part of that guy over there". The sub-binary will be allocated on
    the heap, just like any other data, and to the user it looks like
    you made a copy of a part of the original binary. If the part is
    quite small, the system might choose to copy the data to a new
    proper binary instead of making a sub-binary. So far so good.

 2) Depending on what you're going to do with that sub-binary, it might
    be a waste of time to create it on the heap, if it's not going to
    live for very long (and in particular if this is in the middle of
    a loop over a binary). The "delayed sub-binary optimization" is
    when the compiler decides to keep the sub-binary info in registers
    or on the stack instead. If the function makes a tail-recursive
    call to itself with the sub-binary as one of the arguments, the
    compiler can detect that it still doesn't need to actually create
    a heap object, but just loop with the info still kept in registers
    and/or on the stack.

 3) This means that if what you're doing is to put the sub-binary in a
    data structure, this optimization is off. The same goes if you are
    passing it to an unknown function (or perhaps even to _any_ other
    function except for tail calls to the current function; this is
    where Mikael and I aren't sure).

 4) Unless you _are_ doing some kind of loop over the contents of a
    binary, the gain from these optimizations is rather negligible,
    but when you _do_ need to traverse a binary, they can give a good
    speedup.
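
To make points 2 and 3 concrete, here is a minimal sketch of the two
situations. The module name binopt_sketch and the functions
count_zeros and tails are made up for illustration; they are not from
Joel's code. Whether the compiler really delays the sub-binary in the
first loop depends on the compiler version, so treat the comments as
my expectation rather than a guarantee; compiling with the
bin_opt_info option should tell you what it actually decided.

```erlang
-module(binopt_sketch).
-export([count_zeros/1, tails/1]).

%% Point 2: a tail-recursive loop over a binary. Rest is only ever
%% passed straight back to the same function, which is the case where
%% creating the sub-binary on the heap can (presumably) be delayed.
count_zeros(Bin) when is_binary(Bin) ->
    count_zeros(Bin, 0).

count_zeros(<<0, Rest/binary>>, N) -> count_zeros(Rest, N + 1);
count_zeros(<<_, Rest/binary>>, N) -> count_zeros(Rest, N);
count_zeros(<<>>, N) -> N.

%% Point 3: each Rest is also stored in a data structure (a list), so
%% a real sub-binary has to exist on the heap for every iteration.
tails(Bin) when is_binary(Bin) ->
    tails(Bin, []).

tails(<<_, Rest/binary>>, Acc) -> tails(Rest, [Rest | Acc]);
tails(<<>>, Acc) -> lists:reverse(Acc).
```

Both functions return the same results with or without the
optimization; the only difference is how much garbage is created
along the way.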

  /Richard