[erlang-questions] clarify: why bit syntax so slow + benchmark code

Sat Nov 17 11:22:57 CET 2007

Mateusz Berezecki wrote:
>
> On Nov 16, 2007, at 4:09 PM, Per Gustafsson wrote:
>
>>>
>>
>> It is slow because for each iteration two sub-binary structures
>> (about 5 words each) are allocated. This will be optimized in the next
>> release of Erlang/OTP, but your code is suboptimal anyway. If you
>> want good performance you should write:
>>
>> test1(<<>>) -> done;
>> test1(<<_A, Rest/binary>>) ->
>>       test1(Rest).
>>
>> That is, you should not first pattern match and create a sub-binary 
>> and then match against that one.
>
> It was written that way to show that some calculations are done on that
> matched value later. Generally speaking, I'm having trouble parsing a
> stream of data where each control frame is a single byte and I have no
> knowledge of the stream unless I parse that single byte (and the ~3-4
> bytes after it, which vary depending on that first byte).
>
> I'm curious, can Erlang achieve good performance with that kind of
> stream data filtering?
>
>
> Mateusz Berezecki
>
The way to write that would be:

parse_stream(<<1,some parsing pattern,Rest/binary>>) ->
  ...
  parse_stream(Rest);
parse_stream(<<2,some other parsing pattern,Rest/binary>>) ->
  ...
  parse_stream(Rest);
...
parse_stream(<<>>) -> ok.

This will minimize the construction of unnecessary sub-binaries, and in
R12B you should be able to get good performance with this kind of
approach (it still wouldn't be as fast as C, but a lot closer).
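
To make the clauses above concrete, here is a minimal sketch with an invented frame layout. The module name, the two tag values, the field sizes and the returned tuples are all assumptions made up for illustration; they are not taken from your actual protocol. The point is only that the tag byte and its fields are matched directly in the function head, and Rest is handed straight to the recursive call.

```erlang
-module(stream_sketch).
-export([parse_stream/1]).

%% Hypothetical frame layout, assumed only for this example:
%%   tag 1: followed by a 16-bit big-endian length field
%%   tag 2: followed by three payload bytes
%% Returns {ok, Frames} with the decoded frames in stream order.
parse_stream(Bin) when is_binary(Bin) ->
    parse_stream(Bin, []).

%% The tag and its fields are matched in one pattern; no intermediate
%% one-byte binary is created and matched a second time.
parse_stream(<<1, Len:16, Rest/binary>>, Acc) ->
    parse_stream(Rest, [{len, Len} | Acc]);
parse_stream(<<2, A, B, C, Rest/binary>>, Acc) ->
    parse_stream(Rest, [{triple, A, B, C} | Acc]);
parse_stream(<<>>, Acc) ->
    {ok, lists:reverse(Acc)}.
```

With this layout, stream_sketch:parse_stream(<<1,0,5,2,10,20,30>>) returns {ok,[{len,5},{triple,10,20,30}]}. An unknown tag or a truncated frame matches no clause and raises function_clause, which you may or may not want in a real parser.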

If you want more information about the implementation of the bit
syntax, you should read the paper I presented at this year's EUC:

http://www.erlang.se/euc/07/papers/1700Gustafsson.pdf

Per