Thursday, October 24, 2013

Apache Lucene™ Features

Lucene offers powerful features through a simple API:

Scalable, High-Performance Indexing

  • over 150GB/hour on modern hardware
  • small RAM requirements -- only 1MB heap
  • incremental indexing as fast as batch indexing
  • index size roughly 20-30% the size of text indexed

Powerful, Accurate and Efficient Search Algorithms

  • ranked searching -- best results returned first
  • many powerful query types: phrase queries, wildcard queries, proximity queries, range queries and more
  • fielded searching (e.g. title, author, contents)
  • sorting by any field
  • multiple-index searching with merged results
  • allows simultaneous update and searching
  • flexible faceting, highlighting, joins and result grouping
  • fast, memory-efficient and typo-tolerant suggesters
  • pluggable ranking models, including the Vector Space Model and Okapi BM25
  • configurable storage engine (codecs)

Cross-Platform Solution

Lucene™ Tutorials


Search Tutorials


Search capabilities have grown in importance as the quantity of digital data on computers and on the web has grown exponentially. These tutorials focus on Apache Lucene, a fantastic library that can be used to build fast, powerful search functionality into applications.
  1. How do I use Lucene to index and search text files?
  2. How do I delete a document from a Lucene index using the value of a field?
  3. How do I optimize a Lucene index after deleting documents from the index?
  4. How do I use an index in memory?
  5. How do I control how much of a document is indexed?
  6. How do I convert a file system index to a memory index?
  7. How do I search an index for a term?
  8. How do I search an index for a prefix?
  9. How do I perform a range query?
  10. How do I perform a wildcard query?
  11. How do I combine queries with a boolean query?
  12. How do I query for words near each other with a phrase query?
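
Several of the questions above (term, prefix and boolean queries) are easier to picture with a toy inverted index in hand. The sketch below is plain Java and does not use Lucene's API. The class name `ToyIndex` and its methods `add`, `term`, `prefix` and `and` are hypothetical names chosen for illustration, and the tokenizer is a simple lowercase split on non-word characters, which is an assumption rather than how any Lucene analyzer behaves.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy inverted index: NOT Lucene's API, only an illustration of the idea.
class ToyIndex {
    // term -> sorted ids of the documents that contain it
    private final TreeMap<String, TreeSet<Integer>> postings = new TreeMap<>();
    private final List<String> docs = new ArrayList<>();

    // Adds a document and returns its id (its position in insertion order).
    int add(String text) {
        int id = docs.size();
        docs.add(text);
        // Assumed tokenizer: lowercase, split on runs of non-word characters.
        for (String token : text.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                postings.computeIfAbsent(token, k -> new TreeSet<>()).add(id);
            }
        }
        return id;
    }

    // Term query: documents containing exactly this term.
    Set<Integer> term(String t) {
        TreeSet<Integer> hit = postings.get(t.toLowerCase());
        return hit == null ? new TreeSet<>() : new TreeSet<>(hit);
    }

    // Prefix query: union of the postings of every term starting with the prefix.
    Set<Integer> prefix(String p) {
        String lo = p.toLowerCase();
        Set<Integer> out = new TreeSet<>();
        // Terms are kept sorted, so the matches form one contiguous range.
        for (Set<Integer> ids : postings.subMap(lo, lo + Character.MAX_VALUE).values()) {
            out.addAll(ids);
        }
        return out;
    }

    // Boolean AND: documents present in both result sets.
    static Set<Integer> and(Set<Integer> a, Set<Integer> b) {
        Set<Integer> out = new TreeSet<>(a);
        out.retainAll(b);
        return out;
    }

    public static void main(String[] args) {
        ToyIndex idx = new ToyIndex();
        idx.add("Lucene indexes text files");      // id 0
        idx.add("Searching an index for a term");  // id 1
        idx.add("Index text in memory");           // id 2

        System.out.println(idx.term("index"));    // [1, 2]
        System.out.println(idx.prefix("index"));  // [0, 1, 2]
        System.out.println(and(idx.term("text"), idx.term("index"))); // [2]
    }
}
```

The term query misses document 0 because that document contains "indexes" rather than "index", while the prefix query picks it up. Lucene's real query classes do far more (scoring, analysis, on-disk postings), so treat this only as a mental model for the tutorials listed above.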
