[erlang-questions] Reading large (1GB+) XML files.

Kenneth Lundin kenneth.lundin@REDACTED
Thu Aug 16 09:25:08 CEST 2007


Hi,

There is already support for handling infinite streams in the xmerl application.
You should use the xmerl_eventp module and the functions there.
The documentation is here: http://www.erlang.org/doc/man/xmerl_eventp.html

We use it ourselves and it works OK, but I admit that the
documentation is a bit sparse.
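
Since the documentation is thin, here is a rough sketch of how the
streaming interface can be driven. It is an illustration only, not the
code we run: the module name xml_stream_sketch and the function
element_names/1 are made up for this example, and it assumes that the
scanner delivers an `ended` event carrying an #xmlElement{} record for
each closing tag. The acc_fun throws away every parsed node, so the
element content is not kept and the full tree is not built in memory.

```erlang
-module(xml_stream_sketch).
-export([element_names/1]).

%% Record definitions for #xmerl_event{} and #xmlElement{}.
-include_lib("xmerl/include/xmerl.hrl").

%% Stream File through xmerl_eventp and return the names of all
%% elements, in the order in which their end tags are seen.
element_names(File) ->
    Self = self(),
    Ref = make_ref(),
    %% Called by the scanner for every event. Only end-of-element
    %% events are acted on; everything else passes through untouched.
    EventFun = fun(#xmerl_event{event = ended,
                                data = #xmlElement{name = Name}}, S) ->
                       Self ! {Ref, Name},
                       S;
                  (_Other, S) ->
                       S
               end,
    %% Drop each parsed node instead of adding it to the accumulator.
    AccFun = fun(_Node, Acc, S) -> {Acc, S} end,
    _ = xmerl_eventp:stream(File, [{event_fun, EventFun},
                                   {acc_fun, AccFun}]),
    collect(Ref, []).

%% Drain the messages sent by the event fun during the scan.
collect(Ref, Acc) ->
    receive
        {Ref, Name} -> collect(Ref, [Name | Acc])
    after 0 ->
        lists:reverse(Acc)
    end.
```

For a really large file you would not send the events to the calling
process as done here, since they just pile up in its mailbox. Send them
to a separate process (or a pool of workers) that handles each record
as it arrives.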

/Kenneth (Erlang/OTP team at Ericsson)

On 8/15/07, Joe Armstrong <erlang@REDACTED> wrote:
> Interesting - I've been writing some new XML libraries, and handling
> infinite streams (well, very large ones) is one of the problems I've
> been thinking about.
>
> I'll poke around tomorrow and send you some code that might help.
>
> /Joe Armstrong
>
> On 8/15/07, Patrik Husfloen <husfloen@REDACTED> wrote:
> > I've been trying to learn Erlang for a while, and I recently found
> > what I thought to be an easy starter project. I currently have a
> > simple application that reads data from a couple of XML files using
> > SAX, and inserts it using an RPC over HTTP.
> >
> > I'm not sure about the terminology here. I've been stuck in OO land
> > for so long that everything looks like an object, but here's what I'm
> > thinking: one thread reads the XML files and pieces together the data,
> > then hands off each record to a pool of workers that issue the
> > HTTP requests. Or maybe the XML-reading part could just spawn a new
> > thread for each record it reads, and ensure that at most X are
> > running at a time?
> >
> > The HTTP request was easy enough to get working, but I'm having
> > trouble with reading the XML. I used xmerl_scan:file to parse the
> > file, but that loads the file into memory before starting to process.
> >
> > I took a look at Erlsom and its SAX reader examples, but those read
> > the entire file into a binary before passing it off to the XML reader.
> >
> >
> > Thanks,
> >
> > Patrik
> > _______________________________________________
> > erlang-questions mailing list
> > erlang-questions@REDACTED
> > http://www.erlang.org/mailman/listinfo/erlang-questions
> >