binaries vs lists

Per Gustafsson per.gustafsson@REDACTED
Tue Nov 11 16:57:05 CET 2003



On Tue, 11 Nov 2003, Serge Aleynikov wrote:

> Hi!
>
> I was wondering if someone could contribute a thought to the following
> question regarding efficiency.
>
> I have an Erlang TCP client that processes a binary stream which needs
> to be post-processed by removing escaped bytes.  Let's say, that byte
> 16#FF is escaped as <<16#FE, 16#01>>, and the 16#FF value is used as a
> message separator.  The variable hex messages are within 512 bytes each.
>
> What would be faster for Erlang:
>
> 1. Declare a socket to return a *list* of bytes, and do something like:
>
> unescape([], Msg)             -> {lists:reverse(Msg), []};
> unescape([16#FE, 1 | T], Msg) -> unescape(T, [16#FF | Msg]);
> unescape([16#FF | _Bytes] = NextMsg, Msg) ->
>      {lists:reverse(Msg), NextMsg};
> unescape([H | T], Msg)        -> unescape(T, [H | Msg]).
>
> 2. Alternatively, declare a socket to return *binaries*, and do:
>
> unescape(<<>>, Msg) -> {binary_to_list(Msg), <<>>};
> unescape(<<16#FE, 1, T/binary>>, Msg) ->
>      unescape(T, concat_binary([Msg, <<16#FF>>]));
> unescape(<<16#FF, _Bytes/binary>> = NextMsg, Msg) ->
>      {binary_to_list(Msg), NextMsg};
> unescape(<<H, T/binary>>, Msg) ->
>      unescape(T, concat_binary([Msg, <<H>>])).
>
> Intuitively I think that the binary approach should work faster, but I
> want to make sure that concat_binary is not expensive to do for every
> byte in a stream.
>
> Thanks.
>
> Serge
>
>
>

I think it would be reasonable to do something like this:

 unescape(<<>>, Msg) -> {lists:reverse(Msg), <<>>};
 unescape(<<16#FE, 1, T/binary>>, Msg) ->
      unescape(T, [16#FF | Msg]);
 unescape(<<16#FF, _Bytes/binary>> = NextMsg, Msg) ->
      {lists:reverse(Msg), NextMsg};
 unescape(<<H, T/binary>>, Msg) ->
      unescape(T, [H | Msg]).

The reason is that concat_binary has to copy the whole binary each time
something is concatenated to it, so appending one byte at a time gets
more expensive as the message grows. Since the result of the operation
should be a list anyway, it is reasonable to build that list directly
instead of first constructing a binary (which is costly) and then turning
it into a list.
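
As a minimal sketch of how the clauses above might be packaged, here is a
self-contained module. The module name unescape_demo and the one-argument
entry point unescape/1 are assumptions added for illustration; only the
four unescape/2 clauses come from the suggestion above. It assumes the
socket is opened in binary mode, so the input arrives as a binary.

```erlang
-module(unescape_demo).
-export([unescape/1]).

%% Hypothetical entry point (not in the original post): start with an
%% empty accumulator list.
unescape(Bin) when is_binary(Bin) ->
    unescape(Bin, []).

%% End of input: the accumulator was built in reverse, so flip it once.
unescape(<<>>, Msg) ->
    {lists:reverse(Msg), <<>>};
%% The escape sequence 16#FE 16#01 stands for a literal 16#FF byte.
unescape(<<16#FE, 1, T/binary>>, Msg) ->
    unescape(T, [16#FF | Msg]);
%% A bare 16#FF is the message separator: stop and return the unread
%% remainder, which still starts with the separator.
unescape(<<16#FF, _/binary>> = NextMsg, Msg) ->
    {lists:reverse(Msg), NextMsg};
%% Any other byte is copied through unchanged.
unescape(<<H, T/binary>>, Msg) ->
    unescape(T, [H | Msg]).
```

Each step prepends one byte to the list, which is constant time, and the
list is reversed only once at the end. Note that the returned remainder
still begins with the 16#FF separator, so a caller would need to strip
that byte before unescaping the next message.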

/Per



