binaries vs lists
Per Gustafsson
per.gustafsson@REDACTED
Tue Nov 11 16:57:05 CET 2003
On Tue, 11 Nov 2003, Serge Aleynikov wrote:
> Hi!
>
> I was wondering if someone could contribute a thought to the following
> question regarding efficiency.
>
> I have an Erlang TCP client that processes a binary stream which needs
> to be post-processed by removing escaped bytes. Let's say, that byte
> 16#FF is escaped as <<16#FE, 16#01>>, and the 16#FF value is used as a
> message separator. The variable hex messages are within 512 bytes each.
>
> What would be faster for Erlang:
>
> 1. Declare a socket to return a *list* of bytes, and do something like:
>
> unescape([], Msg) -> {lists:reverse(Msg), []};
> unescape([16#FE, 1 | T], Msg) -> unescape(T, [16#FF | Msg]);
> unescape([16#FF | Bytes]=NextMsg, Msg) ->
> {lists:reverse(Msg), NextMsg};
> unescape([H | T], Msg) -> unescape(T, [H | Msg]).
>
> 2. Alternatively, declare a socket to return *binaries*, and do:
>
> unescape(<<>>, Msg) -> {binary_to_list(Msg), <<>>};
> unescape(<<16#FE, 1, T/binary>>, Msg) ->
> unescape(T, concat_binary([Msg, <<16#FF>>]));
> unescape(<<16#FF, _Bytes/binary>> = NextMsg, Msg) ->
> {binary_to_list(Msg), NextMsg};
> unescape(<<H, T/binary>>, Msg) ->
> unescape(T, concat_binary([Msg, <<H>>])).
>
> Intuitively I think that the binary approach should work faster, but I
> want to make sure that concat_binary is not expensive to do for every
> byte in a stream.
>
> Thanks.
>
> Serge
>
>
>
I think it would be reasonable to do something like this:
unescape(<<>>, Msg) -> {lists:reverse(Msg), <<>>};
unescape(<<16#FE, 1, T/binary>>, Msg) ->
    unescape(T, [16#FF | Msg]);
unescape(<<16#FF, _Bytes/binary>> = NextMsg, Msg) ->
    {lists:reverse(Msg), NextMsg};
unescape(<<H, T/binary>>, Msg) ->
    unescape(T, [H | Msg]).
The reason is that concat_binary has to copy the whole binary every time
something is concatenated to it, which is costly when done for every byte.
And since the result of the operation should be a list anyway, it is
reasonable to build that list directly instead of first constructing a
binary and then turning it into a list.
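As a rough sketch of how this could be packaged and called (the module
name unescape_demo, the unescape/1 wrapper and the messages/1 driver are
made up for illustration and are not part of the original question; it
also assumes the socket delivers binaries and that empty messages can be
skipped):

```erlang
-module(unescape_demo).
-export([unescape/1, messages/1]).

%% unescape(Bin) -> {Msg, Rest}
%% Msg is the decoded message as a list of bytes. Rest is the part of
%% the binary that was not consumed; it starts at the 16#FF separator
%% if one was found, and is <<>> otherwise.
unescape(Bin) when is_binary(Bin) ->
    unescape(Bin, []).

%% The accumulator is a list built in reverse, so each step is a cheap
%% cons and the list is reversed only once per message.
unescape(<<>>, Acc) ->
    {lists:reverse(Acc), <<>>};
unescape(<<16#FE, 1, T/binary>>, Acc) ->
    unescape(T, [16#FF | Acc]);
unescape(<<16#FF, _/binary>> = Rest, Acc) ->
    {lists:reverse(Acc), Rest};
unescape(<<H, T/binary>>, Acc) ->
    unescape(T, [H | Acc]).

%% messages(Bin) -> [Msg]
%% Hypothetical driver: decodes every message in a buffer, skipping the
%% 16#FF separators. Two separators in a row produce no empty message.
messages(Bin) when is_binary(Bin) ->
    messages(Bin, []).

messages(<<>>, Acc) ->
    lists:reverse(Acc);
messages(<<16#FF, T/binary>>, Acc) ->
    messages(T, Acc);
messages(Bin, Acc) ->
    {Msg, Rest} = unescape(Bin),
    messages(Rest, [Msg | Acc]).
```

With this, unescape_demo:messages(<<1, 2, 16#FE, 1, 16#FF, 3>>) gives
[[1,2,255],[3]]. A real client would also have to keep the last,
possibly incomplete, message around until the next packet arrives; that
is left out here.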
/Per