[erlang-questions] Project volunteers

Jan Lehnardt jan@REDACTED
Mon Jun 30 12:54:04 CEST 2008


Hey Joe,

This sounds like a very fun project indeed, and I can help with any
CouchDB-related problem.

One issue I can see with CouchDB is that it trades disk space
for everything else (reliability, fault tolerance, speed). So a
high-volume data project on a low-end (small disk) computer might not
be the best fit. For raw data storage, CouchDB will use less than two
times the size of the actual data. This might be a deal-breaker. Also,
CouchDB uses so-called 'views' for fast data access via B-tree indexes,
and those can add even more data to the store.
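
To make the disk-space concern concrete, here is a back-of-the-envelope
sketch. The storage factor of 2.0 is the rough upper figure mentioned
above; the view overhead of 0.5 and the 4 GiB dump size are made-up
placeholders, not measurements, and the function name is hypothetical.

```python
def estimated_footprint(raw_bytes, storage_factor=2.0, view_factor=0.5):
    """Return a rough on-disk size estimate in bytes.

    storage_factor: multiple of the raw data used by the database file
                    (2.0 is the rough upper figure from the email).
    view_factor:    extra space for view indexes, as a fraction of the
                    raw data (0.5 is an assumed placeholder).
    """
    if raw_bytes < 0:
        raise ValueError("raw_bytes must be non-negative")
    return int(raw_bytes * (storage_factor + view_factor))


GIB = 1024 ** 3

# Example: a hypothetical 4 GiB text dump, result printed in GiB.
print(estimated_footprint(4 * GIB) / GIB)
```

Plugging in real numbers from a test import would be the only way to
know whether the target machine's disk is large enough.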

Mac OS X comes with a facility to create compressed disk images; maybe
something similar is available elsewhere. If we put CouchDB's (or any)
data directory onto a compressed disk image, we might get quite far
with the data compression.
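
One rough way to gauge how much a compressed volume might save is to
compress a sample of article text and compare sizes. The sketch below
uses zlib from the Python standard library, which is not the mechanism
a disk image uses, so treat the result only as a ballpark. The sample
text is invented and highly repetitive, which overstates the savings
you would see on real articles.

```python
import zlib


def compression_ratio(data: bytes, level: int = 6) -> float:
    """Return compressed size divided by original size (lower is better)."""
    if not data:
        return 1.0  # nothing to compress; report "no savings"
    return len(zlib.compress(data, level)) / len(data)


# Invented sample; real wiki text will compress far less well than this.
sample = ("'''Erlang''' is a [[programming language]] used for "
          "concurrent systems. ") * 200

print(round(compression_ratio(sample.encode("utf-8")), 3))
```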

On the web-server end: CouchDB already comes with MochiWeb and
lets you serve static data. What we could do is build a small
HTML+JavaScript GUI, served from CouchDB, that uses Ajax to pull the
actual contents from CouchDB. The catch here would be creating a proper
web GUI without too much ajaxy fanciness, using Ajax only for data
retrieval. That would mean, though, that we need to either preprocess
the Wikipedia contents into something the Ajax GUI can use, or the GUI
would need to do the wiki->html conversion itself. Probably not a good
idea either way :)
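
To illustrate the two pieces discussed above, here is a minimal sketch:
fetching one article document over CouchDB's HTTP interface (a plain
GET on /database/document-id that returns JSON) and converting a very
small subset of wiki markup to HTML. The database name "wikipedia", the
document IDs, and the function names are assumptions for illustration,
and the converter handles only bold, italic, and simple [[links]],
nowhere near full wiki syntax.

```python
import html
import json
import re
from urllib.parse import quote
from urllib.request import urlopen


def doc_url(base, db, doc_id):
    """Build the URL of one document: <base>/<db>/<doc_id>."""
    return "%s/%s/%s" % (base.rstrip("/"),
                         quote(db, safe=""),
                         quote(doc_id, safe=""))


def fetch_doc(base, db, doc_id):
    """GET one document and decode its JSON body (needs a running server)."""
    with urlopen(doc_url(base, db, doc_id)) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wiki_to_html(text):
    """Convert a tiny subset of wiki markup: bold, italic, [[links]]."""
    # Escape &, <, > first; keep apostrophes so the markup still matches.
    out = html.escape(text, quote=False)
    out = re.sub(r"'''(.+?)'''", r"<b>\1</b>", out)   # bold before italic
    out = re.sub(r"''(.+?)''", r"<i>\1</i>", out)
    # Link targets are used as-is here; a real GUI would map them to URLs.
    out = re.sub(r"\[\[([^\]|]+)\]\]", r'<a href="\1">\1</a>', out)
    return out


# No network access is needed for these two calls.
print(doc_url("http://localhost:5984/", "wikipedia", "Main Page"))
print(wiki_to_html("'''Erlang''' is ''fun'', see [[OTP]]"))
```

Whether the conversion runs ahead of time during import or on each
request is exactly the trade-off described above; the same function
could serve either approach.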

In any case, CouchDB can be used as a dumb data store just as well,
with a custom MochiWeb frontend.

Cheers
Jan
--
PS: It's "CouchDB", no 't' in there. It stands for "Cluster Of
Unreliable Commodity Hardware" :-)

On Jun 30, 2008, at 11:50, Joe Armstrong wrote:

> Hi Guys,
>
> I've been at the erlang exchange and come back with a headful of
> ideas.
>
> I have an idea for a fun project.
>
> Make an offline stand-alone version of the wikipedia for places
> without internet access.
> Distribute to the world.
>
> I thought to use the following:
>
>     - erlang
>     - coutchDB
>     - mochiWeb
>
> Jobs to do:
>
>    - convert wikipedia dumps (mySQL format) to coutchDB
>    - make rendering engine to convert wiki text to HTML
>    - compress data dumps to make entire wikipedia as small as possible
>    - shoehorn into a low-power "one laptop for every child" computer
>    - make distribution package
>    - release manager (set up groups)
>    - write documentation
>
>
> /Joe Armstrong
>



