[erlang-questions] Not an Erlang fan

Bjorn Gustavsson bjorn@REDACTED
Mon Sep 24 11:46:33 CEST 2007


I usually don't write about things we haven't released yet,
but in this case I think I need to say a few words about
binaries in R12B.

Although Thomas's advice is very good for R11B, it could lead to
worse performance in R12B. So if you need the very best performance
for R11B right now, do as Thomas suggests, but please forget most of
it when R12B is released! :-)

Thomas Lindgren <thomasl_erlang@REDACTED> writes:

> Second, iterating over the binary that way is a common
> mistake. When you write <<C, Rest/binary>>, you also
> build a new binary Rest (which is 'just' a couple of
> pointers, perhaps 3 words). But that is actually
> _more_ memory consuming than turning the binary into a
> list (2 words per element). So perhaps simply turning
> the binary into a list (all at once or in stages) and
> processing that would be the faster option.

That is entirely correct in R11B.

In R12B, the compiler in many cases will optimize away the building
of that new binary (what we call a "sub binary").
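
As a rough sketch of the kind of loop this applies to (the module and
function names are made up for illustration, and the exact conditions
under which the compiler can skip the sub binary are not spelled out in
this message), here is a byte counter where Rest is used only as the
argument of the recursive call:

```erlang
-module(count_demo).
-export([count/1]).

%% Count the bytes in a binary, matching out one byte per step.
count(Bin) when is_binary(Bin) ->
    count(Bin, 0).

%% Rest is never inspected or stored here; it is only handed to the
%% next iteration, which immediately matches on it again.
count(<<_C, Rest/binary>>, N) ->
    count(Rest, N + 1);
count(<<>>, N) ->
    N.
```

For example, count_demo:count(<<"abc">>) returns 3.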

> 
> One way of speeding things up while keeping the same
> program structure is to get more than one byte at a
> time, e.g:
> 
> loop(<<C0,C1,C2,C3,C4,C5,C6,C7, Rest/binary>>) ->
>    ..., loop(Rest);
> loop(Small_bin) -> ...

This code is also fine in R12B. (But matching out only one
element is also quite fast, as no sub binary will be built.)
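
A filled-in version of the quoted loop might look like the sketch below.
The module name and the byte-counting body are assumptions made for
illustration; the quoted code elides both the body and the handling of
the leftover Small_bin, which is done here one byte at a time.

```erlang
-module(count8_demo).
-export([count/1]).

%% Count the bytes in a binary, taking eight bytes per step
%% for as long as at least eight bytes remain.
count(Bin) when is_binary(Bin) ->
    count(Bin, 0).

%% Fast path: match out eight bytes at once.
count(<<_C0,_C1,_C2,_C3,_C4,_C5,_C6,_C7, Rest/binary>>, N) ->
    count(Rest, N + 8);
%% Fewer than eight bytes left: fall back to one byte per step.
count(<<_C, Rest/binary>>, N) ->
    count(Rest, N + 1);
count(<<>>, N) ->
    N.
```

For example, count8_demo:count(<<"0123456789">>) returns 10: one
eight-byte step followed by two single-byte steps.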

> A less pleasant, but more efficient, way of accessing
> each element of a binary is this:

Fortunately, the unpleasant way of accessing a binary is the slowest
way to access a binary in R12B (because none of the new optimizations
can be applied to it), so you should forget it as soon as R12B is
released:

> count_bin(Bin) ->
>     M = 0,
>     N = size(Bin),
>     count(M, N, Bin, 0).
> 
> count(M, N, Bin, Size) ->
>     case Bin of
> 	<<_:M/binary, C, _/binary>> ->
> 	    count(M+1, N, Bin, Size+1);
> 	_ ->
> 	    Size
>     end.
> 
> (use it like
>    {ok, B} = file:read_file("o10k.txt"),
>    count_bin(B).
> )
> 

/Bjorn
-- 
Björn Gustavsson, Erlang/OTP, Ericsson AB


