[erlang-questions] xmerl producing atoms
igwan
igwan@REDACTED
Tue Mar 13 06:53:35 CET 2007
Dear list,
I'm currently working on a system that parses user-provided XML data
using xmerl. The problem I see is that xmerl creates a new atom for
every distinct element name or namespace URI it parses from the input. This
is not a big deal if you work with a limited number of schemas and
"internal" users, but when you have to accept input from the internet,
your node could quickly be taken down by someone filling up the atom table. The
documentation ("Efficiency Guide" / 7.1 "Memory") says that atoms are
not garbage-collected.
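
To make the concern concrete, here is a minimal sketch of the behaviour.
The module name atom_demo and the function element_name/1 are made up
for illustration; xmerl_scan:string/1 and the #xmlElement record come
from xmerl itself.

```erlang
-module(atom_demo).
-export([element_name/1]).

%% Record definitions for xmerl's parse results (#xmlElement{} etc.).
-include_lib("xmerl/include/xmerl.hrl").

%% Hypothetical helper: parse an XML string with default options and
%% return the root element's name exactly as xmerl reports it.
element_name(Xml) ->
    {#xmlElement{name = Name}, _Rest} = xmerl_scan:string(Xml),
    Name.
```

Calling atom_demo:element_name("<order/>") returns the atom order, so
every previously unseen tag name in untrusted input adds one more entry
to the atom table.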
I came across another post from 2005 describing this issue:
http://www.erlang.org/ml-archive/erlang-questions/200502/msg00070.html
I have looked at alternative parsers like ErlSom, but it seems to work
against a pre-compiled schema, and for my application I have to accept
any XML document without knowing its structure. Plus, I make heavy use
of hook_fun in xmerl_scan.
My questions: is there a (possibly undocumented) option for telling
xmerl to produce binaries or strings instead of atoms? Or are there
plans to garbage-collect atoms in the near future?
Thanks in advance,
igwan