[erlang-questions] current xml parsers

Michał Ptaszek michal.ptaszek@REDACTED
Tue Mar 27 15:25:22 CEST 2012


There was a bug in exml (a buffer overflow in the NIF function) that
was fixed a couple of weeks ago. I tried it on WURFL today, and the
results are as follows:

{ok, XML} = file:read_file("/tmp/wurfl.xml").
timer:tc(exml, parse, [XML]).
{3796645,
 {ok,{xmlelement,<<"wurfl">>,[],
                 [{xmlcdata,<<"\n  ">>},
                  {xmlelement,<<"version">>,[],
                              [{xmlcdata,<<"\n    ">>},
                               {xmlelement,<<"ver">>,[],
                                           [{xmlcdata,<<"2.3, db.scientiamobile.c"...>>}]},
                               {xmlcdata,<<"\n    ">>},
                               {xmlelement,<<"last_updated">>,[],
                                           [{xmlcdata,<<"Thu Nov 17 18:01"...>>}]},
                               {xmlcdata,<<"\n    ">>},
                               {xmlelement,<<"official_url">>,[],
                                           [{xmlcdata,<<"http://w"...>>}]},
                               {xmlcdata,<<"\n\t    ">>},
                               {xmlelement,<<"maintainers">>,[],
                                           [{xmlcdata,<<...>>},{xmlelement,...},{...}|...]},
                               {xmlcdata,<<"\n\t    ">>},
...
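
For reading the result: timer:tc/3 returns {Time, Value}, where Time is
the elapsed time in microseconds, so the run above took roughly 3.8
seconds. A minimal shell sketch of the conversion, using the figure from
that run (the variable names and the placeholder atom are illustrative
only, not part of exml):

```erlang
%% Shape of a timer:tc/3 result: {ElapsedMicroseconds, ReturnValue}.
%% The atom below is a placeholder standing in for the parsed XML tree.
{Micros, _Value} = {3796645, parse_result_elided}.
%% Convert microseconds to seconds ("/" is float division in Erlang).
Seconds = Micros / 1000000.
io:format("~.2f s~n", [Seconds]).
```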

Best regards,
Michal Ptaszek


----- Original Message -----
> I think I tried that lib once, because I had the requirement to parse
> WURFL (http://wurfl.sourceforge.net/). It is a giant XML file that has
> info about mobile devices. The lib worked fine with normal-sized XML
> files, but WURFL crashed it (and Erlang along with it).
> 
> 
> Sergej
> 
> On Mon, Mar 26, 2012 at 3:34 PM, Michał Ptaszek <
> michal.ptaszek@REDACTED> wrote:
> 
> > Hi Roberto,
> >
> > you might be interested in looking at exml:
> > it's a very simple NIF-based parser built around
> > the expat library:
> > https://github.com/paulgray/exml
> >
> > Based on some simple benchmarks of my own, it
> > should be 2-3 times faster than xmerl.
> >
> > Best regards,
> > Michal Ptaszek
> >
> > ----- Original Message -----
> > > Dear list,
> > >
> > > does someone have recent considerations on xml parsers in terms
> > > of memory footprint, parsing speed and stability?
> > >
> > > The ones I'm aware of are xmerl, erlsom [1] and the driver used
> > > in ejabberd (which unfortunately is GPL).
> > >
> > > I don't care about DTD validation.
> > >
> > > Thank you,
> > >
> > > r.
> > >
> > > [1] http://erlsom.sourceforge.net/erlsom.htm
> > >
> > > _______________________________________________
> > > erlang-questions mailing list
> > > erlang-questions@REDACTED
> > > http://erlang.org/mailman/listinfo/erlang-questions
> > >
> >
> 
