big erlang web solution
Mon Jun 26 16:35:05 CEST 2006
> (3) full text search and index maintenance - docs must be full text
> searchable by type (which would imply a file directory). Index of
> text search must be automatic as docs are added/removed/modified.
> This part of the system must be zero admin (or close to it). I don't
> care much about the size of index files; disk space is cheap. I care
> more about fast performance and low memory footprint of queries,
> usefulness of query results, and low admin of entire search system.
> (4) search results must be "google-like". (a) results contain enough
> highlighted context from the original search to let the user know which
> items in the results are worth digging into. (b) queries have
> "continuations", meaning I don't have to retrieve all 100,000
> matching results in one chunk just to show the user the first three
> pages. This aspect must have low memory consumption.
ke han, have you looked at tsearch2 for Postgres?
http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/
It requires you to keep your text in the database, but it's very low
maintenance and it gives google-like results.
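
To make the two "google-like" requirements in (4) concrete, here is a rough
Python sketch, independent of tsearch2 or Lucene. Every name in it (search,
snippet, next_page, the sample docs) is made up for illustration, and the
linear scan stands in for a real index lookup. The point is only that a lazy
iterator can act as the continuation, so one page of hits is held in memory
at a time, and that each hit carries a short highlighted context.

```python
import re
from itertools import islice

def snippet(text, match, width=20):
    """Return the matched term wrapped in ** **, with up to `width`
    characters of context on each side."""
    start = max(match.start() - width, 0)
    end = min(match.end() + width, len(text))
    return (text[start:match.start()]
            + "**" + match.group(0) + "**"
            + text[match.end():end])

def search(docs, term):
    """Lazily yield (doc_id, snippet) for every document containing term.
    Nothing is scanned until the caller asks for the next result."""
    pattern = re.compile(re.escape(term), re.IGNORECASE)
    for doc_id, text in docs:
        match = pattern.search(text)
        if match:
            yield doc_id, snippet(text, match)

def next_page(results, page_size):
    """Pull one page from the iterator; the iterator is the continuation."""
    return list(islice(results, page_size))

# Hypothetical sample data standing in for the document store.
docs = [
    (1, "Erlang is good at concurrency."),
    (2, "Postgres stores the text."),
    (3, "An erlang web solution."),
    (4, "Lucene keeps only the index."),
    (5, "Big Erlang systems exist."),
]

results = search(iter(docs), "erlang")
page1 = next_page(results, 2)   # first two hits only
page2 = next_page(results, 2)   # resumes where page1 stopped
```

In a database-backed setup the same idea would more likely be expressed as a
cursor or a limit/offset on the query rather than a Python generator.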
Another option is to use Lucene with Jinterface. With Lucene, you can
store just the index, but I'm not sure how changes to the file system
would cascade into Lucene in an automated manner. This is where
tsearch2 shines IMO.
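
One possible way to get file system changes into an external index is to
poll: take a snapshot of the document directory, compare it with the
previous one, and hand the differences to the indexer. The Python sketch
below shows only that comparison step. The function names are hypothetical,
comparing modification times is an assumption about what counts as
"modified", and the calls that would actually update Lucene are left out.

```python
import os

def snapshot(root):
    """Map every file path under root to its last-modified time (ns)."""
    state = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            state[path] = os.stat(path).st_mtime_ns
    return state

def diff(old, new):
    """Compare two snapshots and return (added, removed, modified) paths."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    modified = sorted(p for p in new if p in old and new[p] != old[p])
    return added, removed, modified
```

A periodic job would keep the last snapshot, call snapshot() again, and
index the added and modified paths while deleting the removed ones. How
often to poll is a trade-off between index freshness and disk activity.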