binaries vs lists

Serge Aleynikov serge@REDACTED
Tue Nov 11 13:44:29 CET 2003


I was wondering if someone could contribute a thought to the following 
question regarding efficiency.

I have an Erlang TCP client that processes a binary stream which needs 
to be post-processed by removing escaped bytes.  Let's say that byte 
16#FF is escaped as <<16#FE, 16#01>>, and the 16#FF value is used as a 
message separator.  The variable-length hex messages are within 512 bytes each.

What would be faster for Erlang:

1. Declare a socket to return a *list* of bytes, and do something like:

unescape([], Msg)             -> {lists:reverse(Msg), []};
unescape([16#FE, 1 | T], Msg) -> unescape(T, [16#FF | Msg]);
unescape([16#FF | _]=NextMsg, Msg) ->
     {lists:reverse(Msg), NextMsg};
unescape([H | T], Msg)        -> unescape(T, [H | Msg]).

2. Alternatively, declare a socket to return *binaries*, and do:

%% Msg is a binary accumulator here, initially <<>>
unescape(<<>>, Msg) -> {binary_to_list(Msg), <<>>};
unescape(<<16#FE, 1, T/binary>>, Msg) ->
     unescape(T, concat_binary([Msg, <<16#FF>>]));
unescape(<<16#FF, _/binary>> = NextMsg, Msg) ->
     {binary_to_list(Msg), NextMsg};
unescape(<<H, T/binary>>, Msg) ->
     unescape(T, concat_binary([Msg, <<H>>])).

Intuitively I think that the binary approach should work faster, but I 
want to make sure that concat_binary is not expensive to do for every 
byte in a stream.
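
A possible third variant, given only as a rough sketch: keep the socket in binary mode and match on the binary as in option 2, but accumulate the unescaped bytes in a list as in option 1, so nothing is concatenated per byte. The module name unescape_sketch and the one-argument wrapper are made up for illustration, and whether this is actually faster than either option above would still have to be measured.

```erlang
-module(unescape_sketch).   % hypothetical module name, for illustration only
-export([unescape/1]).

%% Returns {MsgBytes, Rest}: MsgBytes is the unescaped message as a list of
%% bytes, Rest is the unconsumed binary (it starts at the 16#FF separator,
%% or is empty when the input ran out first).
unescape(Bin) when is_binary(Bin) ->
    unescape(Bin, []).

%% Acc holds the bytes seen so far in reverse order.
unescape(<<>>, Acc) ->
    {lists:reverse(Acc), <<>>};
unescape(<<16#FE, 1, T/binary>>, Acc) ->        % escaped 16#FF
    unescape(T, [16#FF | Acc]);
unescape(<<16#FF, _/binary>> = NextMsg, Acc) -> % separator: message is done
    {lists:reverse(Acc), NextMsg};
unescape(<<H, T/binary>>, Acc) ->               % ordinary byte
    unescape(T, [H | Acc]).
```

For example, unescape_sketch:unescape(<<1, 16#FE, 1, 2, 16#FF, 3>>) returns {[1,255,2], <<255,3>>}.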


