Tuesday 28 July 2009

Lucene stuff

These days, I am trying to use Lucene for my own research. Here I have collected some resources that may be relevant and useful:

Lucene in general: http://lucene.apache.org
-> You can find more details on this site!

Lucene Index Toolbox (Luke): http://www.getopt.org/luke/
-> This tool is very helpful for exploring what the Lucene search engine does with an index. It supports index and document browsing, search, ... through a cross-platform graphical UI.

Great articles:
-> In this article, the author implements his own summarizer using Lucene, an open-source search engine API. It is based mainly on two simple summarization algorithms, namely Classifier4J (C4J) and Open Text Summarizer (OTS).

2) Lucene Analyzer, Tokenizer and TokenFilter: http://mext.at/?p=26
-> How to use analyzers, tokenizers and token filters in Lucene.

3) Lucene Indexing and Document Scoring: (search the web for "lucene indexing and document scoring")
-> Explains some basic Lucene concepts and definitions in detail.

4) Understanding Lucene Scoring: http://www.opensourcereleasefeed.com/article/show/understanding-lucene-scori

5) Lucene Query Syntax: http://lucene.apache.org/java/2_3_2/queryparsersyntax.html (replace the version "2_3_2" if you are using a newer one)
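
To make item 2 a bit more concrete, here is a minimal sketch of the idea behind an analysis chain: a tokenizer splits the raw text into tokens, and token filters then transform that token stream one step at a time. It is written in plain Java and deliberately does not use Lucene's own Analyzer, Tokenizer or TokenFilter classes, so the class name, the method names and the stop-word list are all my own assumptions for illustration, not Lucene's API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Illustration only: mimics the tokenizer -> filter -> filter idea, not Lucene's classes.
public class AnalysisChainSketch {

    // Hypothetical stop-word list, chosen only for this example.
    static final Set<String> STOP_WORDS = Set.of("a", "an", "and", "in", "of", "the", "to");

    // "Tokenizer" step: split the text on anything that is not a letter or a digit.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.split("[^\\p{L}\\p{N}]+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    // First "filter" step: lower-case every token.
    static List<String> lowerCase(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            out.add(token.toLowerCase(Locale.ROOT));
        }
        return out;
    }

    // Second "filter" step: drop tokens found in the stop-word list.
    static List<String> removeStopWords(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            if (!STOP_WORDS.contains(token)) {
                out.add(token);
            }
        }
        return out;
    }

    // "Analyzer": the tokenizer followed by the two filters, in that order.
    static List<String> analyze(String text) {
        return removeStopWords(lowerCase(tokenize(text)));
    }

    public static void main(String[] args) {
        // prints [analyzer, tokenizer, tokenfilter, lucene]
        System.out.println(analyze("The Analyzer, the Tokenizer and the TokenFilter in Lucene"));
    }
}
```

The order of the steps matters here: the stop-word list is all lower case, so the lower-casing step has to run before the stop-word step, otherwise "The" would slip through.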

(to be continued ...)

--
Cheers,
Vu
