[erlang-questions] How does a match context of an erlang bitstring work?

ami m3oucat@REDACTED
Thu Jun 12 16:47:07 CEST 2014


Hi all

I'd read [efficiency guide][1] and [erlang-questions mailing list archive][2] &
all of the available books in erlang. But I haven't found the precise description of efficient
binaries pattern matching. Though, I haven't read sources yet :) But I hope
that people, who already have read them, would read this post. Here are my questions.


 1. How many match contexts does an erlang binary have?

 a) if we match parts of a binary sequentially and just once

        A = <<1,2,3,4>>.
        <<A1,A2,A3,A4>> = A.

 Do we have just one binary match context(moving from the beginning of A to the end), or four?

 b) if we match parts of a binary sequentially from the beginning to the end for the first time
    and(sequentially again) from the beginning to the end for the second time

        B = <<1,2,3,4>>.
        <<B1,B2,B3,B4>> = B.
        <<B11,B22,B33,B44>> = B.

  Do we have just a single match context, which is moving from the beginning of B to the end
  of B and then moving again from the beginning of B to the end of B,

  or

  we have 2 match contexts, one is moving from the beginning of B to the end of B,
  and another - again from the beginning of B to the end of B (as first can't move
  to the beginning again)

  or we have 8 match contexts?


 2. According to [documentation][3], if I write:

        my_binary_to_list(<<H,T/binary>>) ->
            [H|my_binary_to_list(T)];
        my_binary_to_list(<<>>) -> [].

 there will be only 1 match context for the whole recursion tree, even though, this
 function isn't tail-recursive.

 a) Did I get it right, that there would be only 1 match context in this case?

 b) Did I get right, that if I match an erlang binary sequentially(from the beginning to 
 the end), it doesn't matter which recursion type(tail or body) has to be used?(from the binary-matching efficiency point of view)

 c) What if I'm going to process erlang binary NOT sequentially, say, I'm travelling through
   a binary - first I match first byte, then 1000th, then 5th, then 10001th, then 10th...

   In this case,
   
   d1) If I used body-recursion, how many matching contexts for this binary would I have - 
   one or >1?
   
   d2) if I used tail-recursion, how many matching contexts for this binary would I have - 
   one or >1?


 3. If I pass a large binary(say 1 megabyte) via tail recursion, Will all the 1 megabyte data be copied? Or only a some kind of pointer to the beginning of this binary is being passed between calls?

 4. Does it matter which binary I'm matching - big or small - a match context will be created for binary of any size or only for large ones?
 

  [1]: http://www.erlang.org/doc/efficiency_guide/users_guide.html
  [2]: http://erlang.org/pipermail/erlang-questions/
  [3]: http://www.erlang.org/doc/efficiency_guide/binaryhandling.html#id65745


Thanks a lot in advance  & kind regards,
Alex


More information about the erlang-questions mailing list