[erlang-questions] Parsing binaries performance

Darren New dnew@REDACTED
Wed Jun 25 19:09:03 CEST 2008


Sebastian Dehne wrote:
> I'm trying to write a parser in Erlang for a byte-stream (which I 
> receive from the TCP socket), but I realise that my code is slow 
> compared to the Java version which I have. I've attached both versions.

You might want to look at the re module, the regexp module, and/or the 
string module, any of which might be faster.  Test, of course.
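
As a rough illustration of the re route, here is a minimal sketch that 
looks for the first CRLF in a binary.  It assumes your release ships the 
re module; the module name crlf_re and the function find_crlf/1 are made 
up for this example, and I have not benchmarked it against anything.

```erlang
-module(crlf_re).
-export([find_crlf/1]).

%% Hypothetical helper: returns the zero-based byte offset of the first
%% CRLF in Bin, or false if there is none.
find_crlf(Bin) when is_binary(Bin) ->
    %% With the default options, re:run/2 reports the whole match as a
    %% single {Offset, Length} pair.
    case re:run(Bin, <<"\r\n">>) of
        {match, [{Pos, _Len}]} -> Pos;
        nomatch -> false
    end.
```

Calling crlf_re:find_crlf(Bin) on each received chunk would tell you 
where the line ends without splitting the binary by hand.  Whether it 
beats a hand-written loop is something only a measurement can say.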

Another possibility would be to try code that simply indexes into the 
binary instead of breaking it apart into a new binary, as that might not 
need to copy things around as much.  Something like

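%% True if the byte at zero-based offset Inx in Bin equals Chr.
check_pos(Bin, Inx, Chr) ->
   <<_:Inx/binary, MaybeCh:integer, _/binary>> = Bin,
   Chr == MaybeCh.

%% Zero-based offset of the first CRLF in Bin, or false if there is none.
find_cr(Bin) -> find_cr(Bin, 0).

%% Stop once fewer than two bytes remain at Inx.
find_cr(Bin, Inx) when Inx + 1 >= size(Bin) -> false;
find_cr(Bin, Inx) ->
   case check_pos(Bin, Inx, $\r) andalso check_pos(Bin, Inx+1, $\n) of
      true  -> Inx;
      false -> find_cr(Bin, Inx+1)
   end.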

Completely untested, not even compiled, but that's what I'd try next. I 
don't know if the compiler will wind up copying any parts of the binary 
in check_pos or not.

-- 
Darren New / San Diego, CA, USA (PST)
  Helpful housekeeping hints:
   Check your feather pillows for holes
    before putting them in the washing machine.


