Word Frequency Analysis

Johann Höchtl
Fri Dec 4 17:57:03 CET 2009


Hello!

I need to run a word frequency analysis on a fairly large corpus. So far
I have discovered the Disco database
http://discoproject.org/

which seems to include a tf-idf indexer. What about CouchDB? I found an
article reporting that it fails rather quickly (somewhere between 100 and
1000 Wikipedia text pages):
http://knuthellan.com/2009/07/09/the-couchdb-indexer-lightweight-search-engine-in-hours/

Are there other Erlang frameworks, or can somebody point me to another
DBM system which naturally supports word frequency analysis?
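
For illustration, the kind of computation in question can be sketched in
a few lines of Python. This is only a minimal in-memory sketch, not tied
to Disco, CouchDB or any Erlang framework. The function names (tokenize,
term_frequencies, tf_idf) are hypothetical, and the weighting used
(relative term frequency times log(N / document frequency)) is just one
common tf-idf variant. A large corpus would need the same steps spread
over a distributed job or a database.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase the text and split it into runs of word characters.
    return re.findall(r"\w+", text.lower())


def term_frequencies(docs):
    # One Counter per document, mapping word -> raw count.
    return [Counter(tokenize(doc)) for doc in docs]


def tf_idf(docs):
    counts = term_frequencies(docs)
    n_docs = len(counts)

    # Document frequency: the number of documents each word occurs in.
    doc_freq = Counter()
    for count in counts:
        doc_freq.update(count.keys())

    scores = []
    for count in counts:
        total = sum(count.values())
        # Assumed weighting: (count / total words) * log(N / df).
        scores.append({
            word: (n / total) * math.log(n_docs / doc_freq[word])
            for word, n in count.items()
        })
    return scores
```

A word that occurs in every document gets a score of zero under this
weighting, while words concentrated in few documents score higher.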

Thank you!

Regards,
  Johann


More information about the erlang-questions mailing list