[erlang-questions] Rant: I hate parsing XML with Erlang

YC <>
Wed Oct 24 00:14:37 CEST 2007


Of course - others on the internet have thought about the same issues and
have blogged about them...

http://emacspeak.blogspot.com/2007/06/firebox-put-fox-in-box.html

That setup runs Firefox headless and gives you a REPL into Firefox at the
same time... the rest would just be figuring out the vocabulary for talking
to Firefox over a socket...

And apparently you can do all that in emacs -
http://emacspeak.googlecode.com/svn/trunk/lisp/emacspeak-moz.el.

And the engine behind the REPL - http://beta.hyperstruct.net/projects/mozlab.
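
To make the "talk to firefox over socket" part concrete, here is a rough
sketch of a client for a line-based REPL exposed on a TCP port. It is in
Python rather than Erlang purely for brevity. The helper names
(eval_in_repl, _read_until), the "repl> " prompt, and the idea that the
server greets with a banner followed by a prompt are all assumptions for
illustration; check them against whatever MozRepl actually sends before
relying on this.

```python
import socket


def _read_until(sock, marker):
    """Read from the socket until the buffered data ends with `marker`."""
    buf = b""
    while not buf.endswith(marker):
        chunk = sock.recv(4096)
        if not chunk:
            raise ConnectionError("REPL closed the connection before sending a prompt")
        buf += chunk
    return buf


def eval_in_repl(host, port, expression, prompt=b"repl> ", timeout=5.0):
    """Send one expression to the REPL and return the text printed before the next prompt.

    `prompt` is an assumed default, not a confirmed MozRepl value.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        _read_until(sock, prompt)                      # skip the greeting and first prompt
        sock.sendall(expression.encode("utf-8") + b"\n")
        reply = _read_until(sock, prompt)              # everything up to the next prompt
    return reply[:-len(prompt)].decode("utf-8", errors="replace").strip()
```

Something like eval_in_repl("localhost", 4242, "document.title") would be the
usage, where 4242 is my assumption about the default port. A coordinator
(Erlang or otherwise) would then just hold one such connection per browser.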


On 10/23/07, YC <> wrote:
>
> Agreed - utilizing firefox or IE will further allow you to handle
> javascript generated DOMs much more easily than having to write a javascript
> parser yourself, which will enable handling of a much larger set of pages.
>
> But is this *easy* to do within Erlang?
>
> On 10/23/07, Joe Armstrong <> wrote:
> >
> > The point is (or was) that firefox has code to parse virtually any kind
> > of broken warped incomprehensible html - letting firefox figure out the
> > "meaning" of deeply crippled and totally incomprehensible html and then
> > scanning the result (the generated DOM) seems a lot easier than figuring
> > out how to parse crippled HTML yourself - using other stuff as components
> > to do what they are good at doesn't seem that crazy to me.
> >
> > /Joe
> >
> >
> >
> > On 10/23/07, Joel Reymont <> wrote:
> > >
> > > On Oct 23, 2007, at 4:09 PM, Joe Armstrong wrote:
> > >
> > > > You could then use Erlang as a coordination language controlling
> > > > a load of firefoxes on different machines, telling them to go get
> > > > pages and scrape the pages for data which they send back to Erlang.
> > >
> > > This is nuts!!! /With all due respect to Joe/
> > >
> > > --
> > > http://wagerlabs.com