Parsing infinite streams style

Eric Newhuis enewhuis@REDACTED
Thu Mar 4 21:00:52 CET 2004


Try something like this: in the attached parser, follow how the Fragment
binary is passed around in the code. Alternatively, you could use a state
machine.
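
For readers without the attachment, here is a minimal sketch of the
fragment idea. It is not the code from gp_parse_lib.erl. The module name,
the Handler fun and the framing (a 16-bit big-endian length followed by
that many payload bytes) are all assumptions made for illustration, and
the port is assumed to have been opened with [binary, stream, eof].

```erlang
-module(stream_sketch).
-export([loop/3, parse_chunks/2]).

%% Receive loop. Fragment holds the bytes that were left over after the
%% previous parse because they did not yet form a complete chunk.
loop(Port, Fragment, Handler) ->
    receive
        {Port, {data, Data}} when is_binary(Data) ->
            Rest = parse_chunks(<<Fragment/binary, Data/binary>>, Handler),
            loop(Port, Rest, Handler);
        {Port, eof} ->
            ok
    end.

%% Hypothetical framing: <<Len:16, Payload:Len/binary>>.
%% Hand every complete chunk to Handler and return the unparsed remainder.
parse_chunks(<<Len:16, Payload:Len/binary, Rest/binary>>, Handler) ->
    Handler(Payload),
    parse_chunks(Rest, Handler);
parse_chunks(Fragment, _Handler) ->
    %% Not enough bytes for another chunk: keep them for the next round.
    Fragment.
```

Calling loop(Port, <<>>, Handler) starts with an empty fragment. With a
cheap header like this one, re-matching the leftover bytes costs little;
a more elaborate format may justify the state machine mentioned above.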

-------------- next part --------------
A non-text attachment was scrubbed...
Name: gp_parse_lib.erl
Type: application/octet-stream
Size: 24464 bytes
Desc: not available
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20040304/aa1cd8aa/attachment.obj>
-------------- next part --------------



On Mar 4, 2004, at 1:41 PM, Edmund Dengler wrote:

> Hi all!
>
> A bit of continuation on my last question concerning style. I have an
> external executable that will be feeding binary data to my Erlang
> processes via a port. While I am parsing the data, I may need more bytes
> to continue (as the source is an "infinite" stream of bytes). What is the
> accepted methodology for doing this:
>
> (1) Parse what I have; if I don't have enough, cause an error to occur,
> and return to the place I started from (or at least, return as much as I
> could parse, and the remainder). Basically, I am parsing the next "chunk"
> of data, and returning to the start point. I.e.:
>
>   loop
>     receive more bytes (& append to ones we currently have)
>     while parse next chunk is successful
>       call processing function
>
> If I run out of bytes during parsing, I keep adding onto the list I have.
> _But_, I spend a lot of work reparsing what I have already done. To do
> this efficiently, I would need some kind of continuation mechanism to say
> "here are more bytes, continue where you left off", which I don't believe
> Erlang has.
>
> (2) Have some kind of lazy semantics of "get more bytes, I need it now".
>
> If I do (2), I obviously need to pass along functions to call as I match
> each grouping (basically, "process the stuff we have matched so far, for
> each chunk, call the processing function") along with some mechanism to
> fetch more bytes. It also means that at every stage of the parse, I need
> to check to see if I have enough bytes, and if I don't, call the "get more
> bytes, and try again", complicating my code (rather than the simple
> "don't have enough, fail" model of (1)).
>
> I guess I could start to build a framework à la the FSM or ASN stuff, that
> would do this wrapping for me, though it seems it would be a lot of work
> and in the end would be the correct way (and obviously would allow me to
> specify a DSL that makes specifying the patterns better; definitely the
> approach I would take if using Scheme or Lisp).
>
> Is there a current style/methodology I should be looking at to do the
> above? What is the "Erlang way"?
>
> Thanks!
> Ed
>
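
On the continuation mechanism asked about in (1): an ordinary fun can
play that role, because it closes over whatever has been parsed so far.
The sketch below is one possible shape, not an established library API.
The module name, the {more, Cont} / {done, Payload, Rest} return
convention and the length-prefixed format are assumptions chosen for
illustration.

```erlang
-module(cont_sketch).
-export([chunk/1]).

%% Parse one chunk of a hypothetical format: a 16-bit big-endian length
%% followed by that many payload bytes.
%% Returns {done, Payload, Rest} or {more, Cont}, where Cont is a fun
%% that takes the next bytes and resumes where the parse stopped.
chunk(Bin) ->
    header(Bin).

header(<<Len:16, Rest/binary>>) ->
    payload(Len, Rest);
header(Partial) ->
    %% Fewer than two bytes so far: wait for the rest of the header.
    {more, fun(More) -> header(<<Partial/binary, More/binary>>) end}.

payload(Len, Bin) ->
    case Bin of
        <<Payload:Len/binary, Rest/binary>> ->
            {done, Payload, Rest};
        _ ->
            %% Len is already decoded and captured in the closure, so
            %% the header is not parsed again when more bytes arrive.
            {more, fun(More) -> payload(Len, <<Bin/binary, More/binary>>) end}
    end.
```

The caller keeps the returned Cont in its loop state and applies it to
the next binary that arrives from the port; on {done, Payload, Rest} it
calls the processing function and starts again with chunk(Rest).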

