String search in a binary

Magnus Thoäng magnus.thoang@REDACTED
Wed Dec 1 15:56:32 CET 2004


Carsten Schultz wrote:
...
> I have the following piece of code:
> 
> extract_contents(Bin) when binary(Bin) ->
>     Str = binary_to_list(Bin),
>     Start = string:str(Str, "<a name=\"datei")-1,
>     End = string:rstr(Str, "</body>"),
>     Len = End-Start-1,
>     <<_:Start/binary, C:Len/binary, _/binary>> = Bin,
>     C.
> 
> The binary_to_list is bothering me.  Is there a short and efficient
> version doing the same?
...

IIRC, the string:str function is a brute-force text search, so you don't 
gain any performance by using it.

You might do something like...

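extract_contents(Bin) when is_binary(Bin) ->
     extract_contents(Bin, 0).

%% Slide Offset forward one byte at a time until the start string matches.
extract_contents(Bin, Offset) ->
     case Bin of
         <<_:Offset/binary,"<a name=\"datei", Rest/binary>> ->
             extract_body(Rest,0);
         _ when Offset < byte_size(Bin) ->
             extract_contents(Bin, Offset + 1)
         %% No clause matches once Offset reaches the end of Bin, so a
         %% missing start string raises case_clause instead of looping.
     end.

%% Return everything in Bin that precedes the first "</body>".
extract_body(Bin, Offset) ->
     case Bin of
         <<Body:Offset/binary,"</body>",_/binary>> ->
             Body;
         _ when Offset < byte_size(Bin) ->
             extract_body(Bin, Offset + 1)
         %% As above: a missing "</body>" raises case_clause.
     end.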

...but since the start and end strings are fixed rather than passed in as 
parameters, you could probably hard-code a smarter advance of Offset on a 
mismatch, skipping more than one byte at a time.
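
One way to read "clever advancing" is a Horspool-style skip table 
hard-coded for the 14-byte start string. The sketch below is only an 
illustration under that assumption: the module name skip_search and the 
functions find_start and shift are made up for this example. It returns 
the offset of the start string rather than the extracted contents.

```erlang
-module(skip_search).
-export([find_start/1]).

%% Byte offset of the first "<a name=\"datei" in Bin, or nomatch.
find_start(Bin) when is_binary(Bin) ->
    find_start(Bin, 0).

find_start(Bin, Offset) ->
    case Bin of
        <<_:Offset/binary, "<a name=\"datei", _/binary>> ->
            Offset;
        <<_:Offset/binary, _:13/binary, Last, _/binary>> ->
            %% Mismatch: the byte under the end of the 14-byte window
            %% decides how far the window can safely move.
            find_start(Bin, Offset + shift(Last));
        _ ->
            %% Fewer than 14 bytes left, so the string cannot occur.
            nomatch
    end.

%% For each byte, the distance from its last occurrence among the first
%% 13 bytes of the start string to the final position of that string.
shift($e)  -> 1;
shift($t)  -> 2;
shift($a)  -> 3;
shift($d)  -> 4;
shift($\") -> 5;
shift($=)  -> 6;
shift($m)  -> 8;
shift($n)  -> 10;
shift($\s) -> 11;
shift($<)  -> 13;
shift(_)   -> 14.
```

Most bytes in ordinary text do not occur in the start string at all, so 
the window usually jumps 14 bytes per failed attempt instead of one.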

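The reason for using case clauses instead of matching with function 
clauses is that a segment size such as Offset must already be bound when 
the binary pattern is matched. A variable bound by another argument of 
the same function head does not count as bound, so the compiler rejects 
that syntax.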
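
For comparison, here is a sketch that avoids both binary_to_list and the 
hand-written loop, assuming an Erlang/OTP release that ships the binary 
module (binary:match/2 and binary:matches/2). The module name extract is 
made up for the example. Unlike the case-based version above, which 
returns the text after the start string up to the first "</body>", this 
one mirrors the code in the question: it keeps the start string and stops 
at the last "</body>".

```erlang
-module(extract).
-export([extract_contents/1]).

extract_contents(Bin) when is_binary(Bin) ->
    %% 0-based offset of the first start string; a missing string
    %% fails with badmatch, much like the code in the question.
    {Start, _} = binary:match(Bin, <<"<a name=\"datei">>),
    %% 0-based offset of the last "</body>", as string:rstr/2 found it.
    {End, _} = lists:last(binary:matches(Bin, <<"</body>">>)),
    Len = End - Start,
    %% A negative Len makes this match fail rather than return garbage.
    <<_:Start/binary, C:Len/binary, _/binary>> = Bin,
    C.
```

The result is a sub-binary that refers to the original data, so nothing 
is copied into a list along the way.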

-- 
Magnus Thoäng



