[erlang-questions] Speeding up string matching

Bjorn Gustavsson bjorn@REDACTED
Fri Sep 22 09:10:32 CEST 2006


I know that it isn't obvious, but the following version of your
function should be faster:

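break_on_nl1(B) -> break_on_nl1(0, B).

%% No newline found: return the whole binary and an empty tail.
break_on_nl1(Len, Bin) when Len =:= size(Bin) ->
    {Bin, <<>>};
break_on_nl1(Len, Bin) ->
    case Bin of
        %% Only test for a newline at offset Len; bind nothing here.
        <<_:Len/binary, $\n, _/binary>> ->
            %% Build the two sub-binaries once, after the match succeeds.
            <<Msg:Len/binary, _, Tail/binary>> = Bin,
            {Msg, Tail};
        _ ->
            break_on_nl1(Len+1, Bin)
    end.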

I haven't measured execution times, but the revised version should
be faster: your version builds two sub-binaries (Msg and Tail) on
every iteration, while the revised one only tests for the newline
and builds them once, when the match succeeds.
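
For anyone who wants to check this, here is a minimal sketch of a
timing harness. The module name nl_bench and the function names
break_old, break_new and run are made up for illustration; only
timer:tc/3 and list_to_binary/1 are standard. It times each version
once on a binary of N filler bytes followed by a newline. The
numbers will depend on the Erlang/OTP release and the input size.

```erlang
-module(nl_bench).
-export([break_old/1, break_new/1, run/1]).

%% Original version: builds the Msg and Tail sub-binaries on every step.
break_old(B) -> break_old(0, B).

break_old(Len, Bin) when Len == size(Bin) ->
    {Bin, <<>>};
break_old(Len, Bin) ->
    <<Msg:Len/binary, Symb, Tail/binary>> = Bin,
    case Symb of
        $\n -> {Msg, Tail};
        _ -> break_old(Len + 1, Bin)
    end.

%% Revised version: only tests for the newline while scanning and
%% builds the sub-binaries once, when the newline is found.
break_new(B) -> break_new(0, B).

break_new(Len, Bin) when Len =:= size(Bin) ->
    {Bin, <<>>};
break_new(Len, Bin) ->
    case Bin of
        <<_:Len/binary, $\n, _/binary>> ->
            <<Msg:Len/binary, _, Tail/binary>> = Bin,
            {Msg, Tail};
        _ ->
            break_new(Len + 1, Bin)
    end.

%% Time both versions on N filler bytes followed by "\nrest".
%% Returns {OldMicroseconds, NewMicroseconds}.
run(N) ->
    Bin = list_to_binary([lists:duplicate(N, $a), $\n, "rest"]),
    {OldT, Res} = timer:tc(?MODULE, break_old, [Bin]),
    %% Res is already bound, so this also checks that both agree.
    {NewT, Res} = timer:tc(?MODULE, break_new, [Bin]),
    {OldT, NewT}.
```

After compiling with c(nl_bench), a call such as nl_bench:run(100000)
returns the two timings in microseconds. A single run is noisy, so
repeat it a few times before drawing conclusions.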

We hope to add better optimizations to a future version of the
Erlang compiler, so that you will not have to write such contrived
code to get the fastest possible bit syntax matching.

/Bjorn

Gaspar Chilingarov <nm@REDACTED> writes:

> Hi all!
> 
> I wish to share an experience which I had today when writing code.
> 
[...]
> 
> break_on_nl1(B) -> break_on_nl1(0, B).
> break_on_nl1(Len, Bin) when Len == size(Bin) ->
> 	{Bin, <<>>};
> break_on_nl1(Len, Bin) ->
> 	<<Msg:Len/binary, Symb, Tail/binary>> = Bin,
> 	case Symb of
> 		$\n -> {Msg, Tail};
> 		_ -> break_on_nl1(Len+1, Bin)
> 	end.

-- 
Björn Gustavsson, Erlang/OTP, Ericsson AB



