[erlang-questions] 700% speedup

Willem de Jong w.a.de.jong@REDACTED
Fri Jun 22 23:51:00 CEST 2007


It is a strange story. The author claims to have achieved very good results
using Erlang to parse a very big (35 Mbyte) XML file (an iTunes Music Library
file). He suggests that he uses lots of processes to do this.

It made me curious, and I decided to do some tests.  I used my 1.7 GHz
laptop with 1GB of memory, running Windows XP.

- Parsing an iTunes file of 4 Mbyte takes about 4 seconds with the SAX parser
that is the basis of Erlsom (if you let the callback function do something
trivial).

- Parsing the file with Erlsom (which validates it against an XSD and
translates it to records) takes about 5 seconds.

- Parsing the file with Xmerl takes about 8 seconds.

I found an article on parsing the iTunes library using Mono
(http://www.xml.com/pub/a/2004/11/03/itunes.html). On an 800 MHz PowerBook,
parsing a 2.5 Mbyte file apparently took 9 seconds, so I would say that
Erlang doesn't look bad.

Surprisingly, loading the file into Microsoft Internet Explorer takes more
than a minute...

If things scaled linearly, parsing the 35 Mbyte file should take about 40
to 80 seconds, which is about twice as fast as what the author of the blog
claims to have achieved (on another machine, obviously, so comparing these
figures may not make a lot of sense).
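
To make the extrapolation explicit, here is a small sketch in Python (chosen
only for illustration; the names are made up for this example). It scales the
three timings measured on the 4 Mbyte file linearly to 35 Mbyte, which is an
assumption and not a measurement. It gives roughly 35 to 70 seconds, the same
ballpark as the 40 to 80 seconds quoted above.

```python
# Timings (seconds) measured above on the 4 Mbyte file.
MEASURED_MB = 4.0
TARGET_MB = 35.0
measured_seconds = {"SAX parser": 4.0, "Erlsom": 5.0, "Xmerl": 8.0}


def extrapolate(seconds, measured_mb=MEASURED_MB, target_mb=TARGET_MB):
    """Scale a measured time linearly with file size (an assumption)."""
    return seconds * target_mb / measured_mb


# Estimated time per parser for the 35 Mbyte file.
estimates = {name: extrapolate(t) for name, t in measured_seconds.items()}

for name, est in estimates.items():
    print(f"{name}: about {est:.0f} s for {TARGET_MB:.0f} Mbyte")
```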

Unfortunately, these tests fail miserably - Erlang crashes. On my machine I
cannot translate a file (binary) of this size to a list. I have to say that
I was a bit disappointed... Is there a way to fix this?
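
One plausible contributing factor (an assumption, not something measured
here) is the size of the list itself. In Erlang a list needs one cons cell of
two machine words per element, so a list of bytes is several times larger than
the binary it came from. The sketch below, again in Python purely for
illustration and with hypothetical names, estimates that size for a 35 Mbyte
file on 32-bit and 64-bit machines. It ignores the original binary and any
extra copies made during garbage collection.

```python
# One cons cell per list element: head (the byte, as a small integer)
# plus tail pointer, so two machine words per byte of input.
WORDS_PER_LIST_ELEMENT = 2
MB = 1024 * 1024


def list_bytes(binary_bytes, word_size_bytes):
    """Estimated heap bytes for a list holding one element per input byte."""
    return binary_bytes * WORDS_PER_LIST_ELEMENT * word_size_bytes


file_bytes = 35 * MB
list_mb_32 = list_bytes(file_bytes, 4) / MB  # 4-byte words (32-bit machine)
list_mb_64 = list_bytes(file_bytes, 8) / MB  # 8-byte words (64-bit machine)

print(f"32-bit: about {list_mb_32:.0f} Mbyte for the list alone")
print(f"64-bit: about {list_mb_64:.0f} Mbyte for the list alone")
```

On a 32-bit machine this comes to about 280 Mbyte for the list alone, a large
share of the 1GB available on the laptop used here.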

Willem.


On 6/20/07, Brad Anderson <brad@REDACTED> wrote:
>
> I came across this blog today...
>
> http://www.sungnyemun.org/wordpress/?p=323
>
> BA
> _______________________________________________
> erlang-questions mailing list
> erlang-questions@REDACTED
> http://www.erlang.org/mailman/listinfo/erlang-questions
>