[erlang-questions] Reading large (1GB+) XML files.

Kevin A. Smith kevin@REDACTED
Wed Aug 15 21:19:10 CEST 2007


I'd also be interested to hear how experienced Erlangers handle this.  
I'm trying to do some heavy SAX parsing as well and it'd be nice to  
not have to load the entire file into memory at once.

--Kevin
On Aug 15, 2007, at 2:23 PM, Patrik Husfloen wrote:

> I've been trying to learn Erlang for a while, and I recently found
> what I thought would be an easy starter project. I currently have a
> simple application that reads data from a couple of XML files using
> SAX and inserts it via RPC over HTTP.
>
> I'm not sure about the terminology here; I've been stuck in OO land
> for so long that everything looks like an object. Here's what I'm
> thinking: one thread reads the XML files and pieces together the
> data, then hands each record off to a pool of workers that issue the
> HTTP requests. Or maybe the XML-reading part could just spawn a new
> thread for each record it reads and ensure that at most X are running
> at once?
>
> The HTTP request was easy enough to get working, but I'm having
> trouble reading the XML. I used xmerl_scan:file to parse the file,
> but that loads the whole file into memory before it starts processing.
>
> I took a look at Erlsom and its SAX reader examples, but they read
> the entire file into a binary before passing it to the XML reader.
>
>
> Thanks,
>
> Patrik
> _______________________________________________
> erlang-questions mailing list
> erlang-questions@REDACTED
> http://www.erlang.org/mailman/listinfo/erlang-questions
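
For the streaming question above, here is a minimal sketch, assuming an OTP release that ships xmerl_sax_parser (it may not be present in older installs). As far as I know, xmerl_sax_parser:file/2 reads the file in chunks and calls the event fun as it goes, so the whole document never has to sit in memory. The module name stream_records, the function count_records/2, and the idea of merely counting elements are hypothetical, chosen only for illustration; a real event fun would assemble each record and hand it off.

```erlang
-module(stream_records).
-export([count_records/2]).

%% Count the elements whose local name is RecordTag (a string such as
%% "record") in File. The running count is the SAX event state.
count_records(File, RecordTag) ->
    EventFun = fun(Event, _Location, Count) ->
                       handle_event(Event, RecordTag, Count)
               end,
    Options = [{event_fun, EventFun}, {event_state, 0}],
    case xmerl_sax_parser:file(File, Options) of
        {ok, Count, _Rest} -> {ok, Count};
        Other -> {error, Other}   % parse error or file error
    end.

%% A start tag whose local name equals RecordTag bumps the count.
handle_event({startElement, _Uri, RecordTag, _QName, _Attrs}, RecordTag, Count) ->
    Count + 1;
%% Every other event (text, end tags, and so on) leaves it unchanged.
handle_event(_Event, _RecordTag, Count) ->
    Count.
```

Usage would look like stream_records:count_records("data.xml", "record"), where both the file name and the tag name are placeholders.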

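The "at most X running at once" idea from the quoted message can be sketched as follows. This is only one possible shape, not an established recipe: the module name bounded_workers and run/3 are hypothetical, the records arrive as a plain list rather than from a parser, and Fun stands in for whatever issues the HTTP request. In Erlang terms the "threads" are processes, and the dispatcher uses monitors to learn when one has finished.

```erlang
-module(bounded_workers).
-export([run/3]).

%% Apply Fun to each element of Records in its own process, keeping at
%% most Max worker processes alive at any moment. Returns ok once every
%% worker has exited.
run(Fun, Records, Max) when is_integer(Max), Max > 0 ->
    loop(Fun, Records, Max, 0).

%% Nothing left to dispatch and no worker still running: done.
loop(_Fun, [], _Max, 0) ->
    ok;
%% There is a free slot: start a monitored worker for the next record.
loop(Fun, [Record | Rest], Max, Running) when Running < Max ->
    {_Pid, _Ref} = spawn_monitor(fun() -> Fun(Record) end),
    loop(Fun, Rest, Max, Running + 1);
%% Pool is full, or only running workers remain: wait for one to exit.
loop(Fun, Records, Max, Running) ->
    receive
        {'DOWN', _Ref, process, _Pid, _Reason} ->
            loop(Fun, Records, Max, Running - 1)
    end.
```

A worker that crashes still produces a 'DOWN' message, so its slot is freed either way. The sketch assumes the calling process is not monitoring anything else, since any other 'DOWN' message in its mailbox would be counted as a finished worker.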