[erlang-questions] [ANN] Map/Reduce in Erlang and Python - Disco 0.1

Ville H Tuulos ville.h.tuulos@REDACTED
Thu Sep 11 20:55:10 CEST 2008


Zvi wrote:

> Nice work.
> It's not clear from the FAQ whether map and reduce functions can be
> written in languages other than Python. Can they be implemented in
> native Erlang and/or as a compiled C/C++ executable?

Yes. There's an external interface for that purpose:

http://discoproject.org/doc/external.html

However, this approach still uses Python as a middleman, so it's mainly
useful for implementing CPU-intensive tasks in C, C++, OCaml, etc.

It is possible to drop Python entirely and replace it with another
language without changing anything in Disco's Erlang core. No one
has tried that yet, so feel free to be the first (:

For a pure Erlang Map/Reduce implementation, you can use something
considerably simpler than Disco, such as the example in "Programming
Erlang".
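
To show how small the basic model is, here is a rough single-process
sketch in Python. It is only an illustration of the map, group, and
reduce steps: it is not Disco's API and not the code from the book, and
the names map_reduce, count_words, and total are made up for this
example.

```python
from collections import defaultdict


def map_reduce(inputs, map_fn, reduce_fn):
    """Run a tiny in-memory map/reduce (hypothetical helper, not Disco)."""
    # Map phase: each input item yields zero or more (key, value) pairs,
    # which are grouped by key as they arrive.
    groups = defaultdict(list)
    for item in inputs:
        for key, value in map_fn(item):
            groups[key].append(value)
    # Reduce phase: collapse the list of values for each key into one result.
    return {key: reduce_fn(key, values) for key, values in groups.items()}


def count_words(line):
    """Example map function: emit (word, 1) for every word in a line."""
    for word in line.split():
        yield word.lower(), 1


def total(_key, counts):
    """Example reduce function: sum the counts collected for one word."""
    return sum(counts)


if __name__ == "__main__":
    lines = ["disco runs on erlang", "erlang runs the scheduler"]
    print(map_reduce(lines, count_words, total))
```

A distributed system such as Disco adds scheduling, data placement, and
failure handling around these same two user-supplied functions.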


Ville


> Ville H Tuulos wrote:
>>
>> Hi all,
>>
>> I am happy to announce the availability of Disco (already featured on
>> Reddit, Hacker News, etc.), an open-source implementation of the
>> Map/Reduce framework for distributed computing. Its core is written in
>> Erlang, but users typically write jobs in Python.
>>
>> Find the project site at
>>
>> http://discoproject.org
>>
>> or see the source code right away at
>>
>> http://github.com/tuulos/disco/tree/master
>>
>> We at Nokia Research in Palo Alto have been using it successfully for
>> data mining, building probabilistic models, and full-text indexing of
>> hundreds of gigabytes of real-world data on hundreds of CPUs in
>> parallel. If you don't have a spare cluster available, we provide a
>> script that sets up a working cluster automatically on Amazon's EC2
>> cloud.
>>
>> It has been a pleasure to use Erlang to implement the job scheduler
>> and other core components of the system. It uses SCGI to provide a web
>> interface through an external web server, the slave module to start
>> Erlang VMs on slave nodes, and normal port commands to launch Python
>> workers on the nodes.
>>
>> Disco is released under the BSD license. The system is still young:
>> there are known bugs, and work remains to be done on scalability.
>> You're very welcome to try out the system, give feedback, and develop
>> it with us.
>>
>> I'll be at the ICFP / Erlang Workshop in Victoria, so if you're 
>> attending I'd be happy to show a demo and have a chat with you about
>> Disco.
>>
>>
>> Ville Tuulos
>> Member of Research Staff
>> Nokia Research Center
>> Palo Alto
>>