Streaming Input

Håkan Stenholm <>
Mon Feb 28 00:29:31 CET 2005


orbitz wrote:

> Thank you for your quick and insightful reply.  I have one other 
> question. In the portion of my code where I check whether the data is 
> empty but I'm not done extracting the string, I wait for more data. 
> Is this how things are commonly handled?

A common Erlang approach is to spawn a gen_server (process) to handle 
incoming data in a non-blocking manner. Depending on your application, 
you may then simply wait for a complete data block to arrive, which 
should simplify the parsing.
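
As a rough sketch of the "wait for a complete data block" idea (this is not code from the thread): the module name stream_buf and the function feed/3 are hypothetical names chosen for illustration. The process that owns the connection, for instance a gen_server keeping Buffer in its state, would call feed/3 each time a chunk arrives and only start parsing once it gets an ok tuple back.

```erlang
-module(stream_buf).
-export([feed/3]).

%% feed(Buffer, Chunk, Needed) ->
%%     {ok, Block, Rest} | {more, NewBuffer}
%%
%% Append the newly received Chunk to whatever was buffered so far.
%% If at least Needed bytes are now available, split them off as Block
%% and return the leftover bytes as Rest; otherwise return the grown
%% buffer so the caller can keep it until the next chunk arrives.
feed(Buffer, Chunk, Needed)
  when is_binary(Buffer), is_binary(Chunk),
       is_integer(Needed), Needed >= 0 ->
    Acc = <<Buffer/binary, Chunk/binary>>,
    case Acc of
        <<Block:Needed/binary, Rest/binary>> ->
            {ok, Block, Rest};
        _ ->
            {more, Acc}
    end.
```

With this shape, the extract functions never see a partial block, so none of them needs its own "wait for more data" clause.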

>   I have several functions like extract_string, extract_integer, 
> extract_list etc. for dealing with the different datatypes of my 
> format, and all of them require this little piece of code. It seems as 
> though I should have some sort of central place where this is done and 
> dispatched back to the function.

One way you could write a general purpose extract function would be to 
supply a function as an argument:

extract(Data, Length, _, Fun) ->
    DataLength = size(Data),
    L = case DataLength >= Length of
            true -> Length;
            false -> DataLength
        end,
    <<BinData:L/binary, Tail/binary>> = Data,
    {ok, Fun(BinData), Tail}.        %% apply function Fun to BinData
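
To illustrate how such a function might be called, here is a small sketch. The module name extract_fun_demo, the demo/0 function, and the 16-bit big-endian integer layout are assumptions made for the example; they are not taken from the protocol under discussion.

```erlang
-module(extract_fun_demo).
-export([extract/4, demo/0]).

%% extract/4 as given above: take up to Length bytes from Data and
%% hand them to Fun. The third argument is ignored.
extract(Data, Length, _, Fun) ->
    DataLength = size(Data),
    L = case DataLength >= Length of
            true -> Length;
            false -> DataLength
        end,
    <<BinData:L/binary, Tail/binary>> = Data,
    {ok, Fun(BinData), Tail}.

%% Hypothetical usage: the same extract/4 with two different converters.
demo() ->
    %% Convert the first 5 bytes to a string (a list of characters).
    {ok, "hello", <<" world">>} =
        extract(<<"hello world">>, 5, [], fun binary_to_list/1),
    %% Decode the first 2 bytes as a big-endian unsigned integer
    %% (the byte order is an assumption made for this example).
    ToInt = fun(<<N:16/big-unsigned>>) -> N end,
    {ok, 258, <<3>>} = extract(<<1, 2, 3>>, 2, [], ToInt),
    ok.
```

The length check and the binary split live in one place, and each datatype only contributes its own conversion fun.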

This may possibly be parameterized with a type:

extract(Data, Length, _, Fun, Type) ->
    DataLength = size(Data),
    L = case DataLength >= Length of
            true -> Length;
            false -> DataLength
        end,
    <<BinData:L/binary, Tail/binary>> = Data,
    %% supply type data, e.g. string | integer | ...
    {ok, {Type, Fun(BinData, Type)}, Tail}.

In your case, you may also simply do:

extract(Data, Length, _, Type) ->
    DataLength = size(Data),
    L = case DataLength >= Length of
            true -> Length;
            false -> DataLength
        end,
    <<BinData:L/binary, Tail/binary>> = Data,
    {ok, {Type, convert(BinData, Type)}, Tail}.

and supply a convert function with several clauses:

convert(Bin, list) ->
    ... ;
convert(Bin, integer) ->
    ... ;
.
.
.
convert(Bin, string) ->
    ... .
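
A minimal sketch of how the pieces could fit together. The clause bodies below are hypothetical, since the real ones depend on the wire format: the string clause assumes one character per byte and the integer clause assumes a big-endian unsigned encoding. The list clause is left out because its encoding is not described in this thread. The module name convert_demo is likewise made up for the example.

```erlang
-module(convert_demo).
-export([extract/4, convert/2]).

%% extract/4 with a Type tag, as given above.
extract(Data, Length, _, Type) ->
    DataLength = size(Data),
    L = case DataLength >= Length of
            true -> Length;
            false -> DataLength
        end,
    <<BinData:L/binary, Tail/binary>> = Data,
    {ok, {Type, convert(BinData, Type)}, Tail}.

%% Hypothetical clause bodies; adjust them to the actual format.
convert(Bin, string) ->
    %% Assumes one character per byte.
    binary_to_list(Bin);
convert(Bin, integer) ->
    %% Assumes a big-endian unsigned integer.
    binary:decode_unsigned(Bin).
```

For example, convert_demo:extract(<<"abcdef">>, 3, [], string) returns {ok, {string, "abc"}, <<"def">>}. Note that when fewer than Length bytes are available, extract/4 converts only what is there, so the caller still has to check whether the whole block had arrived.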

>
> Thank you,
>
> Håkan Stenholm wrote:
>
>> orbitz wrote:
>>
>>> I am working with a protocol where the size of the following block 
>>> is told to me so I can just convert the next N bytes to, say, a 
>>> string.  The problem is though, I'm trying to write this so it 
>>> handles a stream properly, so the binary I have could contain all N 
>>> bytes that I need, or something less than N. So at first I tried:
>>>
>>> extract_string(Tail, 0, Res) ->
>>>  {ok, {string, Res}, Tail};
>>> extract_string(<<H, Tail/binary>>, Length, Res) ->
>>>  extract_string(Tail, Length - 1, lists:append(Res, [H]));
>>> extract_string(<<>>, Length, Res) ->
>>>  case dispatch_message() of
>>>    {decode, _, Data} ->
>>>      extract_string(Data, Length, Res)
>>>  end.
>>
>>
>>
>> extract_string(Tail, 0, Res) ->
>>     %% when done, reverse the list back to its intended order
>>     {ok, {string, lists:reverse(Res)}, Tail};
>> extract_string(<<H, Tail/binary>>, Length, Res) ->
>>     %% prepending turns an O(N) operation into an O(1) one
>>     extract_string(Tail, Length - 1, [H | Res]);
>> extract_string(<<>>, Length, Res) ->
>>     case dispatch_message() of
>>         {decode, _, Data} ->
>>             extract_string(Data, Length, Res)
>>     end.
>>
>> This version will be much faster than the original version, because 
>> appending an element to the end of a list is an O(N) operation, which 
>> is done N times (O(N^2) in total). Instead, prepend to the front of 
>> the list (an O(1) operation) and reverse the list once you're done 
>> accumulating the (Res) list (O(N)).
>>
>>>
>>> When the binary is empty but I still need more data it waits for 
>>> more.  I don't know if this is the proper idiom (it seems gross to 
>>> me but I am unsure of how to do it otherwise).  This is incredibly 
>>> slow though.  With a long string that I need to extract it takes a 
>>> lot of CPU and far too long.  So I decided to do:
>>>
>>> extract_string(Data, Length, _) ->
>>>  <<String:Length/binary, Tail/binary>> = Data,
>>>  {ok, {string, binary_to_list(String)}, Tail}.
>>
>>
>>
>> You probably want something like this:
>>
>> extract_string(Data, Length, _) ->
>>     DataLength = size(Data),                 %% get length of Data
>>     L = case DataLength >= Length of
>>             true -> Length;
>>             false -> DataLength
>>         end,
>>     <<String:L/binary, Tail/binary>> = Data,
>>     {ok, {string, binary_to_list(String)}, Tail}.
>>
>> This should be able to extract as much data as possible in a single 
>> binary access, and should be slightly faster than my previous 
>> extract_string/3 update above.
>>
>>>
>>> In terms of CPU and time this is much much better, but if I don't 
>>> have all N bytes it won't work.  Any suggestions?
>>>
>>
>>
>>
>>
>
>



