Word Frequency Analysis

Johann Höchtl
Fri Dec 4 09:35:39 CET 2009


Hello!

I need to compute a word frequency analysis of a fairly large corpus. So
far I have discovered the Disco project
http://discoproject.org/

which seems to include a tf-idf indexer. What about CouchDB? I found an
article reporting that it fails rather quickly (somewhere between 100 and
1000 Wikipedia text pages)
http://knuthellan.com/2009/07/09/the-couchdb-indexer-lightweight-search-engine-in-hours/

Are there other Erlang frameworks, or can somebody give me a hint about
another DBM system which naturally supports word frequency analysis?
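
For reference, the kind of computation in question can be sketched in a few
lines of Python. This is only a minimal in-memory illustration, not how Disco
or CouchDB index anything. The function names (tokenize, term_frequencies,
tf_idf), the ASCII-only tokenizer and the tf * log(N / df) weighting are all
assumptions chosen for the example:

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its runs of ASCII letters as words.

    Assumption: ASCII-only tokens are good enough for the illustration;
    a real corpus would need proper Unicode handling.
    """
    return re.findall(r"[a-z]+", text.lower())


def term_frequencies(docs):
    """Return one Counter of raw word counts per document."""
    return [Counter(tokenize(doc)) for doc in docs]


def tf_idf(docs):
    """Return, per document, a dict mapping word -> tf * log(N / df).

    tf is the word count divided by the document length, N is the number
    of documents and df is the number of documents containing the word.
    """
    counts = term_frequencies(docs)
    n_docs = len(docs)

    # Document frequency: in how many documents does each word occur?
    df = Counter()
    for doc_counts in counts:
        df.update(doc_counts.keys())

    scores = []
    for doc_counts in counts:
        total = sum(doc_counts.values())
        scores.append({
            word: (count / total) * math.log(n_docs / df[word])
            for word, count in doc_counts.items()
        })
    return scores


if __name__ == "__main__":
    corpus = ["the cat sat", "the dog sat down"]
    for doc_scores in tf_idf(corpus):
        print(sorted(doc_scores.items(), key=lambda kv: -kv[1]))
```

Everything here lives in memory on a single machine, so it only shows the
arithmetic. For a large corpus the per-document counting and the document
frequency aggregation are the parts that would need to be distributed or
pushed into the storage system.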

Thank you!

Regards,
  Johann


More information about the erlang-questions mailing list