[erlang-questions] [ANN] Map/Reduce in Erlang and Python - Disco 0.1
Ville H Tuulos
Thu Sep 11 20:00:33 CEST 2008
Bob Ippolito wrote:
> Why the choice of SCGI? It seems like it would be a lot simpler not to
> require an external web server. Erlang does HTTP on its own pretty
> well, in my experience :)
Thanks Bob for the comment (: I have used Mochiweb quite happily(*) in
another Erlang project of mine. There are people already working on
Disco to make it work with Mochiweb.
The web server is mainly used to serve large files to workers; SCGI is
just used to forward control requests to the Erlang process. Using an
external web server for IO intensive jobs was a safe choice in the first
place. I'd be happy to replace it with, say, Mochiweb, if it handles the
(*) I think I had a problem with Mochiweb: It uses the raw mode for
sockets, and it seems that there's a 16M limit for gen_tcp:recv() in
If I remember correctly, this caused any HTTP POST request larger than
16M to fail. Please correct me if I'm wrong.
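
If a single receive call really is capped at some size, the usual way around it is to read the request body in bounded chunks until Content-Length bytes have arrived. Below is a minimal Python sketch of that general idea. It is not Mochiweb's or Disco's code; the names read_body and CHUNK, and the 1 MiB chunk size, are assumptions made only for illustration.

```python
import io

# Hypothetical chunk size: 1 MiB per read, far below any per-call cap.
CHUNK = 1 << 20


def read_body(stream, content_length, chunk=CHUNK):
    """Read exactly content_length bytes from a file-like stream in bounded chunks."""
    parts = []
    remaining = content_length
    while remaining > 0:
        # Never ask for more than one chunk, or more than is still missing.
        data = stream.read(min(chunk, remaining))
        if not data:
            # The peer closed the connection before the full body arrived.
            raise EOFError("stream ended with %d bytes missing" % remaining)
        parts.append(data)
        remaining -= len(data)
    return b"".join(parts)


# An in-memory stream stands in for a socket file object here.
body = read_body(io.BytesIO(b"x" * 3000000), 3000000)
```

The point of the loop is that no single read depends on the total body size, so a large POST only costs more iterations.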
> On Thu, Sep 11, 2008 at 2:15 AM, Ville H Tuulos wrote:
>> Hi all,
>> I am happy to announce the availability of Disco (as already featured in
>> Reddit, Hacker News etc.), an open-source implementation of the
>> Map/Reduce framework for distributed computing. Its
>> core is written in Erlang but users typically write jobs in Python.
>> Find the project site at
>> or see the source code right away at
>> We at Nokia Research in Palo Alto have been using it successfully for
>> data mining, building probabilistic models, and full-text indexing of
>> hundreds of gigabytes of real-world data on hundreds of CPUs in
>> parallel. If you don't have a spare cluster available, we provide a
>> script that sets up a working cluster automatically on Amazon's EC2
>> It has been a pleasure to use Erlang to implement the job scheduler
>> and other core components of the system. It uses SCGI to provide a web
>> interface through an external web server, the slave module to start
>> Erlang VMs on slave nodes, and normal port commands to launch Python
>> workers on the nodes.
>> Disco is released under the BSD license. The system is still young,
>> there are known bugs, and there is still work to be done on scalability
>> issues as well. You're very welcome to try out the system, give
>> feedback, and develop the system with us.
>> I'll be at the ICFP / Erlang Workshop in Victoria, so if you're
>> attending I'd be happy to show a demo and have a chat with you about Disco.
>> Ville Tuulos
>> Member of Research Staff
>> Nokia Research Center
>> Palo Alto
>> erlang-questions mailing list
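
For readers new to the model in the announcement quoted above, a Map/Reduce job comes down to two user-written functions: a map step that emits key/value pairs for each input record, and a reduce step that combines the values for each key. The sketch below is a single-process word count in plain Python. It does not use Disco's API; map_words, reduce_counts, and run_job are hypothetical names chosen for illustration, and a real framework would spread the map calls across worker nodes.

```python
from collections import defaultdict


def map_words(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.split():
        yield word.lower(), 1


def reduce_counts(pairs):
    """Reduce step: sum the values emitted for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def run_job(lines):
    """Run map over every line, then reduce all emitted pairs in one process."""
    pairs = (pair for line in lines for pair in map_words(line))
    return reduce_counts(pairs)


result = run_job(["Disco runs jobs", "jobs run on Disco"])
```

Because the map step looks at one line at a time and the reduce step only needs the pairs for a given key, both can be split across many machines without changing the two functions.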