[erlang-questions] Full text search in KV-storage.

Jesper Louis Andersen jesper.louis.andersen@REDACTED
Sat Sep 24 17:27:14 CEST 2011


On Sat, Sep 24, 2011 at 01:58, Oleg Chernyh <erlang@REDACTED> wrote:

> I'm far from linguistics and full text search engines, so your input was
> particularly useful.
> And what about my idea that I have briefly described?

I deliberately avoided answering those parts because I know very
little about the subject. It looks like you stem the words and then
index the stems, but I don't know how you go about it. If you pick a
language, such as English, there are probably two ways to go:
1. Form a set of rules and apply them to
"normalize"/"canonicalize"/"extract-the-stem". 2. Play the Google
game: if you have enough data to mine statistically, figure out what
the stems are via machine learning.
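
For what it's worth, option 1 might look roughly like the Python
sketch below. The suffix rules, the minimum stem length, and the names
stem, build_index and search are all made up for illustration; this is
a toy and not the Porter algorithm or anything a real search engine
ships. It only shows the shape of "apply rules, then index the stems".

```python
# Toy suffix-stripping rules (hypothetical, for illustration only).
# Order matters: longer, more specific suffixes are tried first.
SUFFIX_RULES = [
    ("ies", "y"),
    ("ing", ""),
    ("ed", ""),
    ("es", ""),
    ("s", ""),
]

# Assumed guard so that short words such as "is" or "bed" are left alone.
MIN_STEM_LEN = 3


def stem(word):
    """Return a crude stem by applying the first rule that fits."""
    word = word.lower()
    for suffix, replacement in SUFFIX_RULES:
        if word.endswith(suffix):
            candidate = word[: len(word) - len(suffix)] + replacement
            if len(candidate) >= MIN_STEM_LEN:
                return candidate
    return word


def build_index(docs):
    """Map each stem to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        # Whitespace tokenization only; punctuation is not handled here.
        for token in text.split():
            index.setdefault(stem(token), set()).add(doc_id)
    return index


def search(index, query):
    """Look up a single query word by its stem."""
    return index.get(stem(query), set())


docs = {1: "indexing words", 2: "indexed queries", 3: "plain text"}
index = build_index(docs)
```

With this, search(index, "indexes") finds documents 1 and 2, because
"indexes", "indexing" and "indexed" all reduce to the same stem. In a
KV store the stem would be the key and the set of document ids the
value.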

Historically, judging by papers at AI/ML conferences, it looks like
option 2 wins :P

-- 
J.


