[erlang-questions] XML parser that works on binaries

Willem de Jong
Fri Nov 23 19:41:08 CET 2007


The new version of erlsom (released a couple of days ago on SourceForge) can
parse its input one chunk at a time. That solves the memory footprint problem:
files of arbitrary size, or a stream of data, can be parsed without first
loading the whole document into memory.

The approach is similar to the one xmerl uses, with a 'fetch' hook, as Ulf
calls it: the parser calls back for more data whenever it runs out of input.

An example of how to do this for big files is included with the erlsom
distribution. Using it should be fairly straightforward (even for UTF-8 or
UTF-16 encoded data).
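
As a rough illustration of the idea (this is not the example from the
distribution), a minimal sketch could look like the following. It assumes the
continuation_function option of erlsom:parse_sax/4 as documented in current
erlsom releases, which may differ from the release mentioned above. The module
name, the function names and the 4096-byte default chunk size are made up for
this sketch.

```erlang
-module(chunked_sax).
-export([count_elements/1, count_elements/2, next_chunk/2]).

%% Count the start tags in File, reading it in chunks.
%% The default chunk size is an arbitrary choice for this sketch.
count_elements(File) ->
    count_elements(File, 4096).

count_elements(File, ChunkSize) ->
    {ok, Handle} = file:open(File, [read, raw, binary]),
    try
        %% ASSUMPTION: {continuation_function, Fun, State} as described in
        %% the erlsom documentation for parse_sax/4.
        Options = [{continuation_function, fun next_chunk/2,
                    {Handle, 0, ChunkSize}}],
        %% Start with an empty binary; the parser asks next_chunk/2 for data.
        {ok, Count, _Rest} =
            erlsom:parse_sax(<<>>, 0, fun count_event/2, Options),
        Count
    after
        file:close(Handle)
    end.

%% The 'fetch' hook. Tail is the unparsed remainder of the previous chunk,
%% so it is put in front of the next block read from the file.
next_chunk(Tail, {Handle, Offset, ChunkSize}) ->
    case file:pread(Handle, Offset, ChunkSize) of
        {ok, Data} ->
            {<<Tail/binary, Data/binary>>,
             {Handle, Offset + byte_size(Data), ChunkSize}};
        eof ->
            %% Nothing left to read: hand back only what was left over.
            {Tail, {Handle, Offset, ChunkSize}}
    end.

%% SAX event handler: add one for every startElement event, ignore the rest.
count_event(Event, Count)
  when is_tuple(Event), element(1, Event) =:= startElement ->
    Count + 1;
count_event(_Event, Count) ->
    Count.
```

Only one chunk plus the small unparsed tail is held in memory at any time, so
the footprint depends on the chunk size rather than on the size of the file.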

Regards,
Willem

On Nov 23, 2007 6:09 PM, Joel Reymont wrote:

>
> On Nov 23, 2007, at 4:44 PM, Willem de Jong wrote:
>
> > Why do you want a parser that works on binaries?
>
>
> To minimize the memory footprint of an application, for example.
> ejabberd was a huge memory hog a year and a half ago since XML
> binaries received from the socket were converted to lists for
> processing. It was an even bigger memory hog on an x64 system for
> obvious reasons. I noticed that even then they were using an expat
> driver that could operate on binaries so they may have completed that
> conversion now.
>
>
> --
> http://wagerlabs.com