Parsing infinite streams style
Eric Newhuis
enewhuis@REDACTED
Thu Mar 4 21:00:52 CET 2004
Try something like this: follow the Fragment binary through the
code in the attached parser. Alternatively, you could use a state
machine.
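
As a rough illustration of carrying a leftover fragment between reads
(a sketch only, not the code from gp_parse_lib.erl): it assumes a
hypothetical framing of a 16-bit big-endian length followed by the
payload, and the names stream_sketch, feed/2 and loop/3 are made up
for the example. The port is assumed to have been opened with the
binary option so that data arrives as binaries.

```erlang
-module(stream_sketch).
-export([feed/2, loop/3]).

%% Hypothetical wire format (an assumption for this sketch):
%% each frame is a 16-bit big-endian length, then that many payload bytes.

%% feed(Fragment, NewBytes) -> {Frames, Rest}
%% Appends the new bytes to the unparsed leftover and extracts every
%% complete frame. Rest is the leftover to carry into the next call.
feed(Fragment, NewBytes) ->
    parse(<<Fragment/binary, NewBytes/binary>>, []).

%% A complete frame is available: peel it off and keep going.
parse(<<Len:16, Payload:Len/binary, Rest/binary>>, Acc) ->
    parse(Rest, [Payload | Acc]);
%% Not enough bytes for another frame: hand back what is left over.
parse(Incomplete, Acc) ->
    {lists:reverse(Acc), Incomplete}.

%% Receive loop for the process that owns Port.
%% Handler is a fun called once for each complete frame.
loop(Port, Handler, Fragment) ->
    receive
        {Port, {data, Bytes}} when is_binary(Bytes) ->
            {Frames, Rest} = feed(Fragment, Bytes),
            lists:foreach(Handler, Frames),
            loop(Port, Handler, Rest)
    end.
```

Started as stream_sketch:loop(Port, Handler, <<>>), this never reparses
a frame that has already been delivered; only the short trailing
fragment is looked at again when more bytes arrive.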
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gp_parse_lib.erl
Type: application/octet-stream
Size: 24464 bytes
Desc: not available
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20040304/aa1cd8aa/attachment.obj>
-------------- next part --------------
On Mar 4, 2004, at 1:41 PM, Edmund Dengler wrote:
> Hi all!
>
> A bit of continuation on my last question concerning style. I have an
> external executable that will be feeding binary data to my Erlang
> processes via a port. While I am parsing the data, I may need more bytes
> to continue (as the source is an "infinite" stream of bytes). What is the
> accepted methodology for doing this:
>
> (1) Parse what I have; if I don't have enough, cause an error to occur,
> and return to the place I started from (or at least, return as much as I
> could parse, and the remainder). Basically, I am parsing the next "chunk"
> of data, and returning to the start point. I.e.:
>
> loop
>   receive more bytes (& append to ones we currently have)
>   while parse next chunk is successful
>     call processing function
>
> If I run out of bytes during parsing, I keep adding onto the list I have.
> _But_, I spend a lot of work reparsing what I have already done. To do
> this efficiently, I would need some kind of continuation mechanism to say
> "here are more bytes, continue where you left off", which I don't believe
> Erlang has.
>
> (2) Have some kind of lazy semantics of "get more bytes, I need it now".
>
> If I do (2), I obviously need to pass along functions to call as I match
> each grouping (basically, "process the stuff we have matched so far, for
> each chunk, call the processing function") along with some mechanism to
> fetch more bytes. It also means that at every stage of the parse, I need
> to check to see if I have enough bytes, and if I don't, call the "get more
> bytes, and try again", complicating my code (rather than the simple
> "don't have enough, fail" model of (1)).
>
> I guess I could start to build a framework à la the FSM or ASN stuff that
> would do this wrapping for me, though it seems it would be a lot of work,
> even if in the end it would be the correct way (and obviously would allow
> me to specify a DSL that makes specifying the patterns better; definitely
> the approach I would take if using Scheme or Lisp).
>
> Is there a current style/methodology I should be looking at to do the
> above? What is the "Erlang way"?
>
> Thanks!
> Ed
>
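
On the continuation mechanism asked about under (1): an Erlang fun
(closure) can play that role. The sketch below is one possible shape,
not an established library API. It assumes a hypothetical framing of a
16-bit big-endian length followed by the payload, and the names
cont_sketch, start/0 and feed/2 are invented for the example. Each
call returns the frames completed so far plus a fun that resumes the
parse exactly where it stopped.

```erlang
-module(cont_sketch).
-export([start/0, feed/2]).

%% Hypothetical wire format (an assumption for this sketch):
%% each frame is a 16-bit big-endian length, then that many payload bytes.

%% start() -> Cont
%% Cont is a fun(Bytes) -> {Frames, NextCont}.
start() ->
    fun(Bytes) -> header(<<>>, Bytes, []) end.

%% feed(Cont, Bytes) -> {Frames, NextCont}
feed(Cont, Bytes) ->
    Cont(Bytes).

%% State: waiting for the 2-byte length. Held is a partial header
%% (0 or 1 bytes) saved from an earlier call.
header(Held, Bytes, Acc) ->
    case <<Held/binary, Bytes/binary>> of
        <<Len:16, Rest/binary>> ->
            body(Len, <<>>, Rest, Acc);
        Short ->
            {lists:reverse(Acc), fun(More) -> header(Short, More, []) end}
    end.

%% State: waiting for Need more payload bytes. Got is the part of the
%% payload collected so far, so nothing is parsed twice.
body(Need, Got, Bytes, Acc) when byte_size(Bytes) >= Need ->
    <<Chunk:Need/binary, Rest/binary>> = Bytes,
    header(<<>>, Rest, [<<Got/binary, Chunk/binary>> | Acc]);
body(Need, Got, Bytes, Acc) ->
    Left = Need - byte_size(Bytes),
    Sofar = <<Got/binary, Bytes/binary>>,
    {lists:reverse(Acc), fun(More) -> body(Left, Sofar, More, []) end}.
```

A receive loop would hold the current fun as its state and replace it
with the returned one after every chunk of port data. The two states
(header and body) also make this a small state machine, with the
closure standing in for the state record.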