[erlang-questions] XML parser that works on binaries
Willem de Jong
Fri Nov 23 19:41:08 CET 2007
The new version of erlsom (released a couple of days ago on SourceForge) can
parse its input one chunk at a time. That solves the memory footprint
problem: the input no longer has to be held in memory all at once, so files
of arbitrary size, or a stream of data, can be parsed.
The approach is similar to the approach that xmerl uses, with a 'fetch'
hook, as Ulf calls it.
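To illustrate the idea of such a hook, here is a minimal sketch. It is not
erlsom's or xmerl's actual API: the module name, the function names and the
{Data, NewState} return convention are all assumptions made up for this
example. The scanner is a toy that only counts start tags (it knows nothing
about comments or CDATA), but it shows the key point: when the scanner runs
out of data in the middle of a token, it hands the unparsed tail to the hook
and gets back the tail with the next chunk appended.

```erlang
-module(chunked_scan).
-export([count_start_tags/2, fetch_from_list/2, fetch_from_file/2, demo/0]).

%% Hypothetical hook contract (an assumption for this sketch):
%%   Fetch(Tail, State) -> {Data, NewState}
%% Data is Tail with the next chunk appended, or Tail unchanged at end of input.
count_start_tags(Fetch, State0) ->
    {Data, State1} = Fetch(<<>>, State0),
    scan(Data, Fetch, State1, 0).

%% A lone "<" at the end of the buffer cannot be classified yet: fetch more.
scan(<<"<">> = Tail, Fetch, State, Count) ->
    case Fetch(Tail, State) of
        {Tail, _} -> Count;                          % end of input
        {More, State1} -> scan(More, Fetch, State1, Count)
    end;
%% "</", "<?" and "<!" do not open an element.
scan(<<"<", C, Rest/binary>>, Fetch, State, Count)
  when C =:= $/; C =:= $?; C =:= $! ->
    scan(Rest, Fetch, State, Count);
%% Any other "<" is counted as a start tag.
scan(<<"<", Rest/binary>>, Fetch, State, Count) ->
    scan(Rest, Fetch, State, Count + 1);
scan(<<_, Rest/binary>>, Fetch, State, Count) ->
    scan(Rest, Fetch, State, Count);
%% Buffer exhausted: ask the hook for the next chunk.
scan(<<>>, Fetch, State, Count) ->
    case Fetch(<<>>, State) of
        {<<>>, _} -> Count;                          % end of input
        {More, State1} -> scan(More, Fetch, State1, Count)
    end.

%% Hook for testing: the state is the list of chunks not yet consumed.
fetch_from_list(Tail, [Chunk | Rest]) -> {<<Tail/binary, Chunk/binary>>, Rest};
fetch_from_list(Tail, []) -> {Tail, []}.

%% Hook for big files: the state is a file descriptor opened in binary mode,
%% e.g. with file:open(Name, [read, binary, raw]). Read errors are not handled.
fetch_from_file(Tail, Fd) ->
    case file:read(Fd, 4096) of
        {ok, Chunk} -> {<<Tail/binary, Chunk/binary>>, Fd};
        eof -> {Tail, Fd}
    end.

%% The chunk boundaries deliberately split "</b>" and "<c/>".
demo() ->
    Chunks = [<<"<a><b>x<">>, <<"/b><">>, <<"c/></a>">>],
    count_start_tags(fun fetch_from_list/2, Chunks).
```

chunked_scan:demo() returns 3 (for a, b and c). Only the current chunk plus
a short tail is held at any time, which is what keeps the memory footprint
small.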
An example of how to do this for big files is included with the erlsom
distribution. It should be fairly straightforward (even for UTF-8 or UTF-16
On Nov 23, 2007 6:09 PM, Joel Reymont wrote:
> On Nov 23, 2007, at 4:44 PM, Willem de Jong wrote:
> > Why do you want a parser that works on binaries?
> To minimize the memory footprint of an application, for example.
> ejabberd was a huge memory hog a year and a half ago since XML
> binaries received from the socket were converted to lists for
> processing. It was an even bigger memory hog on an x64 system for
> obvious reasons. I noticed that even then they were using an expat
> driver that could operate on binaries so they may have completed that
> conversion now.