[erlang-questions] list vs binary performance, destructuring and consing

Martynas Pumputis martynasp@REDACTED
Tue Oct 23 10:07:37 CEST 2012


Hi,

Could you show the exact steps of your simulation? The binary version
should be faster, because appending to a binary avoids some extra memory
allocation on each iteration, and large binaries are not copied.
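
To make "exact steps" concrete, here is a minimal sketch of a timing harness around the two copy functions from the quoted message. The module name copy_bench and the helper copies_per_sec/3 are hypothetical names chosen for illustration. This is not the original test, only one way such a measurement might be set up with timer:tc/1.

```erlang
-module(copy_bench).
-export([bin_copy/1, string_copy/1, copies_per_sec/3]).

%% Binary accumulator, as in the quoted message.
bin_copy(Bin) -> bin_copy(Bin, <<>>).

bin_copy(<<>>, Acc) -> Acc;
bin_copy(<<Char, Rest/binary>>, Acc) ->
    bin_copy(Rest, <<Acc/binary, Char>>).

%% List accumulator, as in the quoted message.
string_copy(Str) -> string_copy(Str, []).

string_copy([], Acc) -> lists:reverse(Acc);
string_copy([Char | Rest], Acc) ->
    string_copy(Rest, [Char | Acc]).

%% Hypothetical helper: run Fun(Input) N times and
%% return the number of complete copies per second.
copies_per_sec(Fun, Input, N) when is_integer(N), N > 0 ->
    {Micros, ok} = timer:tc(fun() -> repeat(Fun, Input, N) end),
    %% Guard against a zero reading on very short runs.
    N * 1000000 / max(Micros, 1).

repeat(_Fun, _Input, 0) -> ok;
repeat(Fun, Input, N) ->
    _ = Fun(Input),
    repeat(Fun, Input, N - 1).
```

It could be called as copy_bench:copies_per_sec(fun copy_bench:bin_copy/1, binary:copy(<<"x">>, 10000), 1000), and likewise with fun copy_bench:string_copy/1 and a list of the same length. Results depend on input size, iteration count, and whether the module is compiled or run in the shell, so those details are worth reporting along with the numbers.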

Take a look at: 
http://www.erlang.org/doc/efficiency_guide/binaryhandling.html
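
For comparison, the third variant described in the quoted message (destructuring the binary in the function head while accumulating in a list) might look like the sketch below. The module name bin_via_list is hypothetical, and converting back with list_to_binary/1 at the end is an assumption about how that variant finished. The original code for it was not posted.

```erlang
-module(bin_via_list).
-export([copy/1]).

%% Walk the binary one byte at a time, consing each byte onto a list.
copy(Bin) when is_binary(Bin) -> copy(Bin, []).

%% Input exhausted: restore the order and build the result binary once.
copy(<<>>, Acc) ->
    list_to_binary(lists:reverse(Acc));
copy(<<Char, Rest/binary>>, Acc) ->
    copy(Rest, [Char | Acc]).
```

This keeps the per-byte step a plain cons and pays for the binary construction only once at the end, which makes it a useful third data point next to the pure-binary and pure-list versions.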

Martynas

On 10/23/2012 12:18 AM, Erik Pearson wrote:
> Hi,
> I've read advice from many years ago that, when processing binaries byte
> by byte (e.g. in a recursive parser), performance is better using a list
> to accumulate the bytes rather than binary concatenation. So
> [B|Accum] rather than <<Accum/binary, B>>. There seems to be
> a consensus, however, on the efficiency of binaries compared to list
> strings.
>
> My own quick test, which was just to copy a list or binary element by
> element, showed much better performance for the list version. The test
> was basically to pass an arbitrary string or binary, and copy it some
> number of thousands of times, and output the complete copies per second.
>
> I tried list based accumulation for a binary, using binary destructuring
> in the function head, and that sped things up, but it was still slower
> than the equivalent list string copy.
>
> Are there any tips for binaries? Or is this not a good use case for
> binaries?
>
> test_bin_copy(Bin) ->
>      test_bin_copy(Bin, <<>>).
> test_bin_copy(<<>>, Accum) ->
>      Accum;
> test_bin_copy(<<Char, Rest/binary>>, Accum) ->
>      test_bin_copy(Rest, <<Accum/binary, Char>>).
>
> test_string_copy(Bin) ->
>      test_string_copy(Bin, []).
> test_string_copy([], Accum) ->
>      lists:reverse(Accum);
> test_string_copy([Char|Rest], Accum) ->
>      test_string_copy(Rest, [Char|Accum]).
>
> For what it's worth, this is part of a json module. The current practice
> in json libraries seems to favor binaries, so I assumed there were
> inherent performance advantages. I can imagine, e.g., that an empty
> binary would be stored as a modest sized buffer that would be appended
> in place until there was a need to expand it or copy (e.g. if an older
> version of it was being appended), and that operations on it would be
> fast compared to arbitrary consing (which is however highly optimized.)
>
> I think some of the favoritism for binaries in json libs is because it
> makes it easy to differentiate json strings (as erlang binaries) from
> json arrays (as erlang lists), but my implementation is using tagged
> tuples to contain each json value, so this is not a concern. Of course
> there are the memory concerns, but in my tests any memory concerns about
> list char size vs binary bytes are outweighed by the performance gains.
>
> I'm sure I've put my foot in my mouth at least once, but, anyway, advice
> appreciated.
>
> Thanks,
> Erik.



