[erlang-questions] Project volunteers
Wed Jul 2 10:05:10 CEST 2008
I've been thinking. I think what I'd like to do is follow the approach
described in http://users.softlab.ece.ntua.gr/~ttsiod/buildWikipediaOffline.html
doing as much as possible in Erlang. This will be a good test of my
Erlang toolset. Then I'll rewrite the rendering pipeline.
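
To make the rendering step concrete, here is a minimal sketch of converting a tiny subset of wiki markup (bold, italic, internal links) to HTML. It is written in Python only to keep the illustration short; the plan above is to do this work in Erlang. The function name `render_wikitext` and the `/wiki/` link prefix are assumptions made for the example, and real wiki text (templates, tables, references, nested markup) needs far more than this.

```python
import html
import re


def render_wikitext(text: str) -> str:
    """Convert a very small subset of wiki markup to HTML (illustrative only)."""
    # Escape &, < and > first; quote=False leaves the apostrophes that
    # mark bold and italic text untouched.
    out = html.escape(text, quote=False)

    def link(match):
        target = match.group(1)
        label = match.group(2) or target
        # "/wiki/" is a hypothetical URL prefix chosen for this sketch;
        # the target is not URL-encoded here.
        href = "/wiki/" + target.replace(" ", "_")
        return '<a href="{}">{}</a>'.format(href, label)

    # [[Target]] and [[Target|label]] internal links
    out = re.sub(r"\[\[([^\]|]+)(?:\|([^\]]+))?\]\]", link, out)
    # '''bold''' must be handled before ''italic''
    out = re.sub(r"'''(.+?)'''", r"<b>\1</b>", out)
    out = re.sub(r"''(.+?)''", r"<i>\1</i>", out)
    return out


if __name__ == "__main__":
    print(render_wikitext("'''Erlang''' is a ''language'' by [[Joe Armstrong|Joe]]"))
```

Running regular expressions in a fixed order is the simplest thing that works for flat markup, but it cannot handle nesting or templates, so a real pipeline would more likely need a proper parser.
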
Then I'd like to play with CouchDB to store the index and the data derived
by parsing the page dumps.
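
As a rough illustration of the dump-parsing step, the sketch below streams pages out of a MediaWiki XML export and turns each one into a document keyed by title, which is the shape CouchDB expects (`_id` is CouchDB's document id field). It is in Python only for brevity, not Erlang. The helper names `iter_pages` and `to_docs`, the `text` field name, and the sample data (including its namespace URI) are assumptions for the example, and nothing here talks to a database.

```python
import io
import xml.etree.ElementTree as ET


def local(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(stream):
    """Yield (title, wikitext) pairs from an XML export, one page at a time."""
    title, text = None, None
    for _event, elem in ET.iterparse(stream, events=("end",)):
        name = local(elem.tag)
        if name == "title":
            title = elem.text or ""
        elif name == "text":
            # If a page has several revisions, the last one wins.
            text = elem.text or ""
        elif name == "page":
            yield title, text
            title, text = None, None
            # Drop the page's children so finished pages do not keep
            # their full text in memory.
            elem.clear()


def to_docs(stream):
    """Build CouchDB-style documents (hypothetical layout) keyed by title."""
    return [{"_id": title, "text": text} for title, text in iter_pages(stream)]


# Made-up sample input; the namespace URI is an assumption.
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.3/">
  <page><title>Erlang</title>
    <revision><text>'''Erlang''' is a language.</text></revision></page>
  <page><title>CouchDB</title>
    <revision><text>A document database.</text></revision></page>
</mediawiki>"""

if __name__ == "__main__":
    for doc in to_docs(io.BytesIO(SAMPLE)):
        print(doc["_id"], len(doc["text"]))
```

Streaming with `iterparse` matters here because the full dump is far too large to load as a single tree.
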
The *real* wikipedia has a complex data model described at
It would be very interesting to see what this looks like in a
This problem is interesting to me because the data volumes are large and the
content is of reasonable quality.
On Mon, Jun 30, 2008 at 11:50 AM, Joe Armstrong wrote:
> Hi Guys,
> I've been at the Erlang Exchange and come back with a headful of ideas.
> I have an idea for a fun project.
> Make an offline, stand-alone version of Wikipedia for places
> without internet access.
> Distribute to the world.
> I thought to use the following:
> - Erlang
> - CouchDB
> - MochiWeb
> Jobs to do:
> - convert Wikipedia dumps (MySQL format) to CouchDB
> - make rendering engine to convert wiki text to HTML
> - compress data dumps to make the entire Wikipedia as small as possible
> - shoehorn into a low-power "one laptop for every child" computer
> - make distribution package
> - release manager (set up groups)
> - write documentation
> /Joe Armstrong
[A copy of this mail has been sent to
FRA for monitoring purposes. FRA wants to read all my e-mail anyway and has
been allowed to do so by the Swedish parliament - in violation of article
12 of the UN Universal Declaration of Human Rights]