Streaming Input
Håkan Stenholm
hakan.stenholm@REDACTED
Mon Feb 28 00:05:11 CET 2005
orbitz wrote:
> I am working with a protocol where the size of the following block is
> told to me so I can just convert the next N bytes to, say, a string.
> The problem is though, I'm trying to write this so it handles a stream
> properly, so in the binary I have could be all N bytes that I need, or
> something less than N. So at first I tried:
>
> extract_string(Tail, 0, Res) ->
> {ok, {string, Res}, Tail};
> extract_string(<<H, Tail/binary>>, Length, Res) ->
> extract_string(Tail, Length - 1, lists:append(Res, [H]));
> extract_string(<<>>, Length, Res) ->
> case dispatch_message() of
> {decode, _, Data} ->
> extract_string(Data, Length, Res)
> end.
extract_string(Tail, 0, Res) ->
    %% when done, reverse the list back to the intended order
    {ok, {string, lists:reverse(Res)}, Tail};
extract_string(<<H, Tail/binary>>, Length, Res) ->
    %% prepend instead of append: turns an O(N) operation into an O(1) op.
    extract_string(Tail, Length - 1, [H | Res]);
extract_string(<<>>, Length, Res) ->
    case dispatch_message() of
        {decode, _, Data} ->
            extract_string(Data, Length, Res)
    end.
This version will be much faster than the original, because appending
an element to the end of a list is an O(N) operation which is done N
times (O(N^2) in total). Instead, prepend to the front of the list (an
O(1) operation) and reverse the list once you're done accumulating Res
(O(N)).
>
> When the binary is empty but I still need more data it waits for
> more. I don't know if this is the proper idiom (it seems gross to me
> but I am unsure of how to do it otherwise). This is incredibly slow
> though. With a long string that I need to extract it takes a lot of
> CPU and far too long. So I decided to do:
>
> extract_string(Data, Length, _) ->
> <<String:Length/binary, Tail/binary>> = Data,
> {ok, {string, binary_to_list(String)}, Tail}.
You probably want something like this:
extract_string(Data, Length, _) ->
    DataLength = size(Data), %% get length of Data
    L = case DataLength >= Length of
            true  -> Length;
            false -> DataLength
        end,
    <<String:L/binary, Tail/binary>> = Data,
    {ok, {string, binary_to_list(String)}, Tail}.
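This should be able to extract as much data as possible in a single
binary access, which should be slightly faster than my previous
extract_string/3 update above. Note that when fewer than Length bytes
are available it returns only those bytes, so the caller still has to
fetch the remaining Length - L bytes.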
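To cover the case where the binary holds fewer than Length bytes, the two
ideas can be combined: take whole chunks at a time and only ask for more
data when the current binary runs out. The following is a minimal sketch,
not a tested drop-in. It assumes a hypothetical zero-argument fun More that
stands in for the dispatch_message() call and returns the next chunk as a
binary; the module name stream_sketch and the extra accumulator argument
are likewise illustrative.

```erlang
-module(stream_sketch).
-export([extract_string/3]).

%% extract_string(Data, Length, More) -> {ok, {string, String}, Tail}
%% More is a hypothetical fun() -> binary() standing in for
%% dispatch_message(); it is an assumption, not part of the original code.
extract_string(Data, Length, More) ->
    extract_string(Data, Length, More, []).

%% Enough bytes are buffered: split once and finish.
extract_string(Data, Length, _More, Acc) when byte_size(Data) >= Length ->
    <<Part:Length/binary, Tail/binary>> = Data,
    %% Chunks were prepended, so reverse them before joining.
    Joined = list_to_binary(lists:reverse([Part | Acc])),
    {ok, {string, binary_to_list(Joined)}, Tail};
%% Not enough yet: keep the whole chunk and fetch the next one.
extract_string(Data, Length, More, Acc) ->
    extract_string(More(), Length - byte_size(Data), More, [Data | Acc]).
```

Each chunk is prepended to the accumulator in O(1) and the chunks are
reversed and joined once at the end, so the work stays proportional to the
number of bytes rather than to its square.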
>
> In terms of CPU and time this is much much better, but if I don't have
> all N bytes it won't work. Any suggestions?
>