Streaming Input
Håkan Stenholm
hakan.stenholm@REDACTED
Mon Feb 28 00:29:31 CET 2005
orbitz wrote:
> Thank you for your quick and insightful reply. I have one other
> question. the portion of my code where I check to see if the data is
> empty but I'm not done extracting the string so I wait for more data,
> is this how things are commonly handled?
A common Erlang approach is to spawn a gen_server (process) to handle
incoming data in a non-blocking manner. Depending on your application,
you may then simply wait for a complete data block to arrive, which
should simplify the parsing.
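
As a rough sketch of that idea (not a definitive implementation): the
module name block_buffer, the feed/2 call, the Handler fun and the
framing (a 32-bit big-endian length followed by that many bytes) are all
assumptions made for illustration, not details of your protocol. The
server keeps unparsed bytes in its state and only hands complete blocks
to Handler.

```erlang
-module(block_buffer).
-behaviour(gen_server).

-export([start_link/1, feed/2]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

%% Hypothetical API: Handler is a fun/1 called with each complete block.
start_link(Handler) ->
    gen_server:start_link(?MODULE, Handler, []).

%% Hand newly received bytes to the server without blocking the caller.
feed(Pid, Bin) when is_binary(Bin) ->
    gen_server:cast(Pid, {data, Bin}).

%% State is {Handler, BufferOfUnparsedBytes}.
init(Handler) ->
    {ok, {Handler, <<>>}}.

handle_cast({data, Bin}, {Handler, Buf}) ->
    Rest = drain(Handler, <<Buf/binary, Bin/binary>>),
    {noreply, {Handler, Rest}}.

handle_call(_Request, _From, State) ->
    {reply, ok, State}.

handle_info(_Msg, State) ->
    {noreply, State}.

%% Assumed framing: <<Len:32, Block:Len/binary>>. Deliver every complete
%% block and return whatever incomplete remainder is left over.
drain(Handler, <<Len:32, Block:Len/binary, Rest/binary>>) ->
    Handler(Block),
    drain(Handler, Rest);
drain(_Handler, Incomplete) ->
    Incomplete.
```

With this shape the extract functions never see a partial block, so they
do not need any "wait for more data" clause of their own.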
> I have several things like extract_string, extract_integer,
> extract_list etc, for dealing with the different datatypes of my
> format, and all of them require this little piece of code. it seems as
> though I should have some sort of central place where this is done and
> dispatched back to the function.
One way you could write a general-purpose extract function would be to
supply a function as an argument:

    extract(Data, Length, _, Fun) ->
        DataLength = size(Data),
        L = case DataLength >= Length of
                true  -> Length;
                false -> DataLength
            end,
        <<BinData:L/binary, Tail/binary>> = Data,
        {ok, Fun(BinData), Tail}.  %% apply function Fun to BinData

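
A small usage sketch of the extract/4 above, wrapped in a hypothetical
module extract_sketch with a demo/0 function added purely for
illustration. Note that when Data holds fewer than Length bytes, this
version returns whatever is available, so the caller still has to compare
lengths to decide whether more data is needed.

```erlang
-module(extract_sketch).
-export([extract/4, demo/0]).

%% Same logic as extract/4 in the message; the third argument is unused.
extract(Data, Length, _, Fun) ->
    DataLength = size(Data),
    L = case DataLength >= Length of
            true  -> Length;
            false -> DataLength
        end,
    <<BinData:L/binary, Tail/binary>> = Data,
    {ok, Fun(BinData), Tail}.

demo() ->
    %% Enough data: the first 5 bytes are converted, the rest is returned.
    {ok, "hello", <<" world">>} =
        extract(<<"hello world">>, 5, unused, fun binary_to_list/1),
    %% Too little data: only the 3 available bytes are converted.
    {ok, "hel", <<>>} =
        extract(<<"hel">>, 5, unused, fun binary_to_list/1),
    ok.
```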
This may possibly be parameterized with a type:

    extract(Data, Length, _, Fun, Type) ->
        DataLength = size(Data),
        L = case DataLength >= Length of
                true  -> Length;
                false -> DataLength
            end,
        <<BinData:L/binary, Tail/binary>> = Data,
        %% supply type data, e.g. string | integer | ...
        {ok, {Type, Fun(BinData, Type)}, Tail}.

In your case, you may also simply do:

    extract(Data, Length, _, Type) ->
        DataLength = size(Data),
        L = case DataLength >= Length of
                true  -> Length;
                false -> DataLength
            end,
        <<BinData:L/binary, Tail/binary>> = Data,
        {ok, {Type, convert(BinData, Type)}, Tail}.

and supply a convert function with several clauses:

    convert(Bin, list) ->
        ... ;
    convert(Bin, integer) ->
        ... ;
    .
    .
    .
    convert(Bin, string) ->
        ... .

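
To make the clause-per-type idea concrete, here is a minimal sketch. The
module name convert_sketch and both encodings are assumptions, not taken
from your format: a string is assumed to be raw bytes, and an integer is
assumed to be ASCII decimal digits. Your real clauses will depend on how
the protocol encodes each type.

```erlang
-module(convert_sketch).
-export([extract/4, convert/2]).

%% Take at most Length bytes from Data and convert them according to Type.
extract(Data, Length, _, Type) ->
    L = min(byte_size(Data), Length),
    <<BinData:L/binary, Tail/binary>> = Data,
    {ok, {Type, convert(BinData, Type)}, Tail}.

%% Assumed encodings, for illustration only.
convert(Bin, string) ->
    binary_to_list(Bin);                    %% raw bytes as a character list
convert(Bin, integer) ->
    list_to_integer(binary_to_list(Bin)).   %% ASCII decimal digits
```

Adding a new datatype then means adding one convert/2 clause, while the
length handling stays in a single place.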
>
> Thank you,
>
> Håkan Stenholm wrote:
>
>> orbitz wrote:
>>
>>> I am working with a protocol where the size of the following block
>>> is told to me so I can just convert the next N bytes to, say, a
>>> string. The problem is though, I'm trying to write this so it
>>> handles a stream properly, so in the binary I have could be all N
>>> bytes that I need, or something less than N. So at first I tried:
>>>
>>>     extract_string(Tail, 0, Res) ->
>>>         {ok, {string, Res}, Tail};
>>>     extract_string(<<H, Tail/binary>>, Length, Res) ->
>>>         extract_string(Tail, Length - 1, lists:append(Res, [H]));
>>>     extract_string(<<>>, Length, Res) ->
>>>         case dispatch_message() of
>>>             {decode, _, Data} ->
>>>                 extract_string(Data, Length, Res)
>>>         end.
>>
>>
>>
>>     extract_string(Tail, 0, Res) ->
>>         %% when done, reverse list back to intended order
>>         {ok, {string, lists:reverse(Res)}, Tail};
>>     extract_string(<<H, Tail/binary>>, Length, Res) ->
>>         %% turn O(N) operation into O(1) op.
>>         extract_string(Tail, Length - 1, [H | Res]);
>>     extract_string(<<>>, Length, Res) ->
>>         case dispatch_message() of
>>             {decode, _, Data} ->
>>                 extract_string(Data, Length, Res)
>>         end.
>>
>> This version will be much faster than the original, because
>> appending an element to the end of a list is an O(N) operation, and
>> doing it N times costs O(N^2). Instead, prepend to the front of the
>> list (an O(1) operation) and reverse the accumulated list (Res) once
>> when you're done (O(N)).
>>
>>>
>>> When the binary is empty but I still need more data it waits for
>>> more. I don't know if this is the proper idiom (it seems gross to
>>> me but I am unsure of how to do it otherwise). This is incredibly
>>> slow though. With a long string that I need to extract it takes a
>>> lot of CPU and far too long. So I decided to do:
>>>
>>>     extract_string(Data, Length, _) ->
>>>         <<String:Length/binary, Tail/binary>> = Data,
>>>         {ok, {string, binary_to_list(String)}, Tail}.
>>
>>
>>
>> You probably want something like this:
>>
>>     extract_string(Data, Length, _) ->
>>         DataLength = size(Data),  %% get length of Data
>>         L = case DataLength >= Length of
>>                 true  -> Length;
>>                 false -> DataLength
>>             end,
>>         <<String:L/binary, Tail/binary>> = Data,
>>         {ok, {string, binary_to_list(String)}, Tail}.
>>
>> This should be able to extract as much data as possible in a single
>> binary access, which should be slightly faster than my previous
>> extract_string/3 update above.
>>
>>>
>>> In terms of CPU and time this is much much better, but if I don't
>>> have all N bytes it won't work. Any suggestions?
>>>
>>
>>
>>
>>
>
>
More information about the erlang-questions
mailing list