Streaming Input

Raimo Niskanen raimo@REDACTED
Mon Feb 28 09:53:29 CET 2005


Try to collect binaries until their total size is enough. Then
split the last binary at the right position. If your application
can accept the string as a list of binaries (an I/O list, which
can e.g. be sent to a port with no problem), you are done. If
you need a binary, convert the collected binaries with
list_to_binary on the list of binaries. If you need a list,
convert the collected binaries into lists one at a time and
append them together (a bit expensive, but far less expensive
than taking one byte at a time).
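
A minimal sketch of the collecting approach described above. The
module name stream_collect, the function collect/3 and its return
tuples are hypothetical names chosen for illustration; they are
not from the original message. It assumes the caller fetches the
next chunk itself (for instance through something like
dispatch_message() in the quoted code) and calls collect/3 again
whenever {more, ...} is returned.

```erlang
-module(stream_collect).
-export([collect/3]).

%% collect(Data, Needed, Acc) ->
%%     {ok, IoList, Rest} | {more, StillNeeded, Acc1}
%%
%% Data   : the chunk that has just arrived (a binary)
%% Needed : number of bytes still missing from the block
%% Acc    : chunks collected so far, newest first

%% Enough data: split the last chunk at the right position and
%% return the collected chunks in arrival order as an I/O list.
collect(Data, Needed, Acc) when byte_size(Data) >= Needed ->
    <<Head:Needed/binary, Rest/binary>> = Data,
    {ok, lists:reverse([Head | Acc]), Rest};
%% Not enough data yet: keep the whole chunk and report how many
%% bytes are still missing.
collect(Data, Needed, Acc) ->
    {more, Needed - byte_size(Data), [Data | Acc]}.
```

The returned I/O list can be used as it is where I/O lists are
accepted, or flattened once at the end with
iolist_to_binary(IoList), followed by binary_to_list/1 if a
string is required.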

The major slowdown in your first example is not that you take
one byte at a time; it is that you create a new binary for
the tail on every recursion. If you kept the original
binary and the current offset, you could use
<<_:Offset/binary, Byte, _Tail/binary>> in the matching and
just increment Offset for the next recursion.
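
A small sketch of the offset technique described above. The
module name offset_scan and the function bytes_to_list/1 are
hypothetical, chosen only for illustration. The match is done in
a case expression because Offset has to be bound before it can
be used as a size in the binary pattern.

```erlang
-module(offset_scan).
-export([bytes_to_list/1]).

%% Walk through Bin one byte at a time without building a new
%% tail binary on each step; only the integer Offset changes.
bytes_to_list(Bin) ->
    bytes_to_list(Bin, 0, []).

bytes_to_list(Bin, Offset, Acc) ->
    case Bin of
        %% Skip Offset bytes, pick the next byte, ignore the rest.
        <<_:Offset/binary, Byte, _Tail/binary>> ->
            bytes_to_list(Bin, Offset + 1, [Byte | Acc]);
        %% No byte left at this offset: the whole binary is done.
        _ ->
            lists:reverse(Acc)
    end.
```

For example, offset_scan:bytes_to_list(<<1,2,3>>) returns
[1,2,3]. The sketch prepends each byte and reverses once at the
end; that is a choice made here, not part of the advice above.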

orbitz@REDACTED (orbitz) writes:

> I am working with a protocol where the size of the following block is
> told to me so I can just convert the next N bytes to, say, a string.
> The problem is though, I'm trying to write this so it handles a stream
> properly, so in the binary I have could be all N bytes that I need, or
> something less than N. So at first I tried:
> 
> extract_string(Tail, 0, Res) ->
>   {ok, {string, Res}, Tail};
> extract_string(<<H, Tail/binary>>, Length, Res) ->
>   extract_string(Tail, Length - 1, lists:append(Res, [H]));
> extract_string(<<>>, Length, Res) ->
>   case dispatch_message() of
>     {decode, _, Data} ->
>       extract_string(Data, Length, Res)
>   end.
> 
> When the binary is empty but I still need more data it waits for more.
> I don't know if this is the proper idiom (it seems gross to me but I
> am unsure of how to do it otherwise).  This is incredibly slow though.
> With a long string that I need to extract it takes a lot of CPU and
> far too long.  So I decided to do:
> 
> extract_string(Data, Length, _) ->
>   <<String:Length/binary, Tail/binary>> = Data,
>   {ok, {string, binary_to_list(String)}, Tail}.
> 
> In terms of CPU and time this is much much better, but if I don't have
> all N bytes it won't work.  Any suggestions?

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB


