[erlang-questions] xmerl scan stream

Peter Sabaini peter@REDACTED
Tue Dec 9 01:10:11 CET 2008


Hm, as an afterthought -- this still doesn't solve the original problem, does 
it? 

Say I have this on my input stream:

% telnet localhost 2345
Trying 127.0.0.1...
Connected to localhost.local.
Escape character is '^]'.
<doc>
a
</doc>
<foo />

 ----- 

then I only get the <doc>a</doc> structure back once <foo /> has been entered,
correct?
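
A minimal sketch of the read_chunk idea quoted below, in case it helps anyone following the thread. It shows two variants: one on a passive socket using gen_tcp:recv/2, and one that re-arms {active, once} before each receive. The module name chunk_reader and the extra Socket argument are assumptions made for illustration; they are not part of xmerl_eventp, and the sketch leaves out the find_good_split() logic entirely.

```erlang
-module(chunk_reader).
-export([read_chunk/2, read_chunk_active_once/2]).

%% Passive-socket variant: the socket must have been opened with
%% {active, false}. A length of 0 asks for whatever bytes are available.
read_chunk(Socket, Sofar) ->
    case gen_tcp:recv(Socket, 0) of
        {ok, Bin} ->
            %% Append the new data to what has been buffered so far.
            {ok, iolist_to_binary([Sofar, Bin])};
        {error, closed} ->
            eof;
        {error, Reason} ->
            {error, Reason}
    end.

%% {active, once} variant: ask for exactly one message, then wait for it.
%% The calling process must be the socket's controlling process.
read_chunk_active_once(Socket, Sofar) ->
    ok = inet:setopts(Socket, [{active, once}]),
    receive
        {tcp, Socket, Bin} ->
            {ok, iolist_to_binary([Sofar, Bin])};
        {tcp_closed, Socket} ->
            eof;
        {tcp_error, Socket, Reason} ->
            {error, Reason}
    end.
```

Either variant returns {ok, Binary} with the new data appended to Sofar, or eof once the peer has closed the connection. Both take one chunk per call, so a slow or trusted client cannot flood the process mailbox the way an {active, true} socket allows.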

Thanks,
peter.



On Tuesday 09 December 2008 00:30:42 Peter Sabaini wrote:
> On Monday 08 December 2008 23:53:59 Ulf Wiger wrote:
> > True, you can't really use it directly, but you can copy
> > the code. Basically, the read_chunk/2 function should
> > be replaced by something along the lines of:
> >
> > read_chunk(Sofar) ->
> >     receive
> >         {tcp, _Socket, Bin} ->
> >             {ok, iolist_to_binary([Sofar, Bin])};
> >         {tcp_closed, _Socket} ->
> >             eof
> >     end.
>
> Ok...
>
> > (View this as pseudo code.)
> >
> > You should probably use gen_tcp:recv() instead, or
> > at least an {active, once} socket.
>
> At the moment, this is for "trusted" clients only, so I can code this
> rather liberally, without fear that somebody could abuse that -- is that
> what you meant?
>
> > But you need to
> > rewrite xmerl_eventp:stream/2 slightly.
>
> Ok, I'll try that and report any outcome, maybe other people find this
> useful too.
>
> Thanks,
> peter.
>
> > The complication, when you get down to it, is that the
> > stream continuation fun must take care not to break
> > up the stream in the wrong place. This is because xmerl
> > doesn't use a proper tokenizer, but does a one-pass
> > parse which relies rather heavily on pattern matching.
> >
> > This is what the find_good_split() function is for.
> >
> > BR,
> > Ulf W
> >
> > 2008/12/8 Peter Sabaini <peter@REDACTED>:
> > > On Monday 08 December 2008 23:09:39 Ulf Wiger wrote:
> > >> Hi Peter,
> > >>
> > >> Have you looked at the module xmerl_eventp in xmerl?
> > >>
> > >> You might even be able to use it directly.
> > >
> > > Yes, I suspected that this module might do what I need --
> > > unfortunately, being the thick-skulled newbie that I am, I haven't been
> > > able to figure out how... The docs here
> > > http://www.erlang.org/doc/man/xmerl_eventp.html are pretty succinct.
> > > Aren't the functions in xmerl_eventp for scanning files? Or could I use
> > > those also with a TCP socket?
> > >
> > > Thanks,
> > > peter.
> > >
> > >> BR,
> > >> Ulf W
> > >>
> > >> 2008/12/8 Peter Sabaini <peter@REDACTED>:
> > >> > Hi list,
> > >> >
> > >> > I am trying to get xmerl to parse a stream of data coming in via a
> > >> > TCP socket. The goal would be for xmerl to return xmlRecords as soon
> > >> > as one is complete.
> > >> >
> > >> > I use the continuation function option of xmerl and so far that
> > >> > works ok; unfortunately I only get an xmlRecord as soon as the next
> > >> > xml element starts. Is there a way to tell xmerl to "evaluate
> > >> > eagerly"?
> > >> >
> > >> > Below is the test code I used; any help much appreciated. Is this
> > >> > even possible? Or am I completely on the wrong track and should use
> > >> > a SAX model instead?
> > >> >
> > >> >  -- snip --
> > >> >
> > >> > -module(ap).
> > >> > -compile(export_all).
> > >> >
> > >> > start_server() ->
> > >> >    {ok, Listen} = gen_tcp:listen(2345, [binary, {packet, raw},
> > >> >                                         {reuseaddr, true},
> > >> >                                         {active, true}]),
> > >> >    spawn(fun() -> par_connect(Listen) end).
> > >> >
> > >> > par_connect(Listen) ->
> > >> >    {ok, _Socket} = gen_tcp:accept(Listen),
> > >> >    spawn(fun() -> par_connect(Listen) end),
> > >> >    io:format("par_c ~n", []),
> > >> >    X = xmerl_scan:string("", [{continuation_fun, fun continue/3}]),
> > >> >    io:format("X: ~p ~n", [X]).
> > >> >
> > >> > continue(Continue, Exception, GlobalState) ->
> > >> >    io:format("entered continue/3 ~n", []),
> > >> >    receive
> > >> >        {tcp, _Socket, Bin} ->
> > >> >            Str = binary_to_list(Bin),
> > >> >            io:format("got Str ~p ~n", [Str]),
> > >> >            Continue(Str, GlobalState);
> > >> >        {tcp_closed, _} ->
> > >> >            io:format("Server socket closed~n"),
> > >> >            Exception(GlobalState)
> > >> >    end.
> > >> >
> > >> > main() ->
> > >> >    start_server().
> > >> >
> > >> >
> > >> >  -- snip --
> > >> >
> > >> > --
> > >> >  Peter Sabaini
> > >> >  http://sabaini.at/
> > >> >
> > >> >
> > >> > _______________________________________________
> > >> > erlang-questions mailing list
> > >> > erlang-questions@REDACTED
> > >> > http://www.erlang.org/mailman/listinfo/erlang-questions
> > >
> > > --
> > >  Peter Sabaini
> > >  http://sabaini.at/

-- 
  Peter Sabaini
  http://sabaini.at/
  



