[erlang-questions] Wikipedia rendering engine

Joe Armstrong erlang@REDACTED
Mon Jun 30 13:40:28 CEST 2008


On Mon, Jun 30, 2008 at 1:36 PM, Jan Lehnardt <jan@REDACTED> wrote:
> On Jun 30, 2008, at 13:23, Joe Armstrong wrote:
>>
>> Is there a REST interface so that I can retrieve the latest version of
>> the MediaWiki markup for a specific page with, for example,
>> a wget command?
>
> You can get bulk dumps
> http://en.wikipedia.org/wiki/Wikipedia:Database_download#Where_do_I_get...
>
> Why would you do individual scraping? In order to keep up to date with
> changes that happened between the last dump and now()?
>

To get a few test cases to test my parser on *before* downloading the entire thing.

Also, I suspect the dumps are in MySQL format with XML junk, so it might not be
a trivial job to extract the raw data. I (presumably) will have to install MySQL
and turn some XML stuff into the raw data (just guessing here). I thought that
could be a job for a volunteer :-)
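
For grabbing a handful of test pages, something along these lines might do.
This is only a minimal sketch in Python, not tried against the live site. It
assumes that MediaWiki's index.php returns the raw markup of a page when
called with action=raw; the base URL, the User-Agent string and the helper
names raw_markup_url and fetch_markup are made up for illustration.

```python
import urllib.parse
import urllib.request

# Assumption: location of the wiki's index.php script.
BASE_URL = "http://en.wikipedia.org/w/index.php"


def raw_markup_url(title, base_url=BASE_URL):
    """Build the URL that is assumed to serve the raw markup of one page."""
    # Page titles conventionally use underscores instead of spaces.
    query = urllib.parse.urlencode(
        {"title": title.replace(" ", "_"), "action": "raw"}
    )
    return base_url + "?" + query


def fetch_markup(title):
    """Download the markup of one page as text (needs network access)."""
    request = urllib.request.Request(
        raw_markup_url(title),
        # Hypothetical User-Agent; some sites reject requests without one.
        headers={"User-Agent": "wikitext-test-fetcher/0.1"},
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

The URL produced by raw_markup_url could equally be handed to wget, and the
saved files used as parser test cases without touching the dumps.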

/Joe


> Cheers
> Jan
> --
>
>> Has anybody made an Erlang interface to scrape individual pages from
>> Wikipedia, or to bulk-convert the entire
>> Wikipedia to Erlang terms? :-)
>>
>> /Joe
>>
>>
>>
>> On Mon, Jun 30, 2008 at 11:39 AM, Joe Armstrong <erlang@REDACTED> wrote:
>>>
>>> Hi,
>>>
>>> I was at the Erlang Exchange and heard the *magnificent* talk
>>>
>>> "Building a transactional distributed data store with Erlang", by
>>> Alexander Reinefeld.
>>>
>>> I'll be blogging this as soon as I have the URL of the video of the talk.
>>>
>>> (in advance of this there was a talk at the Google conference on
>>> scalability
>>>
>>>
>>> http://video.google.com/videoplay?docid=-6526287646296437003&q=erlang+scalable&ei=cZ9oSLiDNIiCiwLL9fGwCA&hl=en
>>>
>>> oh, and they also seem to have won the SCALE 2008 prize at the
>>> CCGrid conference in Lyon, but there is zero publicity about this AFAICS
>>> )
>>>
>>> We (collectively) promised to help Alexander - I promised to provide him
>>> with a rendering engine (in Erlang) for the Wikipedia markup language.
>>>
>>> Before I start hacking, has anybody done this before?
>>>
>>> /Joe Armstrong
>>>
>>
>>
>>
>> --
>> fra@REDACTED; ingvar.akesson@REDACTED
>>
>> [Kopia av detta meddelande skickas till FRA för övervakningsändamål.
>> De vill ju ändå läsa min e-post.]
>>
>> [A copy of this mail has been sent to
>> FRA for monitoring purposes. FRA wants to read all my e-mail and has
>> been allowed to do so by the Swedish parliament - in violation of article
>> 12 of the UN Universal Declaration of Human Rights]
>> _______________________________________________
>> erlang-questions mailing list
>> erlang-questions@REDACTED
>> http://www.erlang.org/mailman/listinfo/erlang-questions
>>
>
>





