[erlang-questions] xmerl still slow?

Erik Stenman Erik.Stenman@REDACTED
Wed Jan 31 21:45:10 CET 2007


bryan rasmussen wrote:
[...]
 > There must be a high end of how long it takes to parse something that
 > you consider acceptable, and it is highly likely that five times
 > longer than that high end is considered unacceptable.

OK, I was too simplistic in my statement in order to make
a point briefly and with a hint of humor.

There are actually several problems with the original question
and the subject of the thread: "xmerl still slow?".

Joel worries that xmerl is slow based on an off-hand remark
that Chandru made in January 2005:
  "Using expat is about 5 times faster than xmerl in our
   crude measurements."

Based on this Joel asks:
  "Is xmerl still considered slow? Have there been any improvements
   over the past couple of years?"

I got the impression he was setting out on a new project and
started by looking for the fastest XML-parser.
I tried to point out that this is the wrong question to ask by saying:
  "Show me an application where this performance difference matters,
   and I'll show you an application that shouldn't use XML."

What I meant, but failed to mention, was:

  1. I don't consider xmerl slow.
  2. I have no idea whether expat is 5 times faster or not.
     And I don't think anyone else has a clear idea either.
  3. First write your application using the simplest methods
     available. (E.g. if you need XML in Erlang, use xmerl
     instead of a linked-in driver.) Then worry about performance.
     You very seldom know the performance needs of your application,
     nor the performance of your implementation before you have
     written at least a prototype.
  4. If xmerl is your performance bottleneck, then you probably
     have bigger problems than choosing your XML parser implementation.
  5. I don't think XML is the right thing(tm).
     If you can avoid XML then do it, but don't forget point 3,
     "use the simplest methods available". If XML is the simplest
     method, then by all means use it.

[...]
 > I suppose that we differ in that I think this difference will
 > come up quite a bit and you think it will only exist so few times as
 > to be negligible.

Yes.
My point is that this difference is something you should try to neglect.
You should only worry about performance if you really, really have to.
If you *have* to worry about performance then you should probably
first look at your protocol and your data representation, then at your
implementation.

And if you in the end find that xmerl is too slow, then you could try to
improve the performance of xmerl, but apparently no one has felt
the need to do that so far.


/Erik





