[erlang-questions] Rant: I hate parsing XML with Erlang

Peter C. Chapin Peter.Chapin@REDACTED
Tue Oct 23 17:47:26 CEST 2007


Anders Nygren wrote:

> I tried to use it a couple of years ago and it was of no help to me since
> it actually requires correct HTML. Which the sites I tried to scrape
> refused to provide, (missing end tags and so on).

It is not necessarily incorrect for an HTML document to have missing end
tags. For some elements, such as p and li, the end tag is optional.
Trying to parse an HTML document with an XML parser is not likely to
work well, however. You must either use an SGML parser or make sure you
only point your XML parser at XHTML documents.
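
To make the difference concrete, here is a small sketch. It is in Python
rather than Erlang, purely for illustration, and uses only the standard
library. The names HTML_SNIPPET, parses_as_xml, TagCollector and
html_start_tags are made up for this example. Note that html.parser is a
lenient tokenizer, not a real SGML parser: it reports tags without
building a tree or inferring the omitted end tags.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# HTML fragment that omits the optional </li> and </p> end tags.
HTML_SNIPPET = "<ul><li>one<li>two</ul><p>first<p>second"


def parses_as_xml(text):
    """Return True if a strict XML parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # Omitted end tags make the input ill-formed as XML.
        return False


class TagCollector(HTMLParser):
    """Record start tags as a lenient HTML tokenizer reports them."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)


def html_start_tags(text):
    """Tokenize text as HTML and return the start tags in order."""
    collector = TagCollector()
    collector.feed(text)
    collector.close()
    return collector.start_tags


if __name__ == "__main__":
    print(parses_as_xml(HTML_SNIPPET))
    print(parses_as_xml("<ul><li>one</li><li>two</li></ul>"))
    print(html_start_tags(HTML_SNIPPET))
```

The XML parser rejects the first fragment and accepts the second, which
closes every element the way XHTML requires. The HTML tokenizer gets
through the first fragment without complaint.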

Peter
