Thursday, October 24, 2013

Apache Lucene™ Features

Lucene offers powerful features through a simple API:

Scalable, High-Performance Indexing

over 150GB/hour on modern hardware
small RAM requirements -- only 1MB heap
incremental indexing as fast as batch indexing
index size roughly 20-30% the size of text indexed

Powerful, Accurate and Efficient Search Algorithms

ranked searching -- best results returned first
many powerful query types: phrase queries, wildcard queries, proximity queries, range queries and more
fielded searching (e.g. title, author, contents)
sorting by any field
multiple-index searching with merged results
allows simultaneous update and searching
flexible faceting, highlighting, joins and result grouping
fast, memory-efficient and typo-tolerant suggesters
pluggable ranking models, including the Vector Space Model and Okapi BM25
configurable storage engine (codecs)
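
To make "ranked searching" and "Okapi BM25" a little more concrete, here is a minimal sketch of a toy in-memory inverted index that scores matches with a common form of the BM25 formula. It is plain Java and does not use the Lucene API. The class name TinyBm25Index, the simple tokenizer, and the parameter values k1 = 1.2 and b = 0.75 are all assumptions made for illustration; Lucene's own implementation is far more sophisticated.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy in-memory index with BM25 ranking; illustrative only, not Lucene's implementation. */
class TinyBm25Index {
    // Commonly used BM25 parameter values (assumed here for illustration).
    private static final double K1 = 1.2;
    private static final double B = 0.75;

    // term -> (docId -> number of times the term occurs in that document)
    private final Map<String, Map<Integer, Integer>> postings = new HashMap<>();
    // Token count of each document, indexed by docId.
    private final List<Integer> docLengths = new ArrayList<>();
    private long totalLength = 0;

    /** Adds a document and returns its id (ids are assigned sequentially from 0). */
    int add(String text) {
        int docId = docLengths.size();
        List<String> tokens = tokenize(text);
        for (String token : tokens) {
            postings.computeIfAbsent(token, k -> new HashMap<>())
                    .merge(docId, 1, Integer::sum);
        }
        docLengths.add(tokens.size());
        totalLength += tokens.size();
        return docId;
    }

    /** Returns the ids of documents matching any query term, best score first. */
    List<Integer> search(String query) {
        Map<Integer, Double> scores = new HashMap<>();
        int n = docLengths.size();
        double avgLen = n == 0 ? 0.0 : (double) totalLength / n;
        for (String term : tokenize(query)) {
            Map<Integer, Integer> docs = postings.get(term);
            if (docs == null) {
                continue; // term does not occur in any document
            }
            // Rare terms get a higher inverse document frequency.
            double idf = Math.log(1 + (n - docs.size() + 0.5) / (docs.size() + 0.5));
            for (Map.Entry<Integer, Integer> e : docs.entrySet()) {
                double tf = e.getValue();
                // Length normalization: longer-than-average documents are penalized.
                double norm = 1 - B + B * docLengths.get(e.getKey()) / avgLen;
                double score = idf * tf * (K1 + 1) / (tf + K1 * norm);
                scores.merge(e.getKey(), score, Double::sum);
            }
        }
        List<Integer> ranked = new ArrayList<>(scores.keySet());
        ranked.sort(Comparator.comparingDouble((Integer d) -> -scores.get(d)));
        return ranked;
    }

    /** Lowercases the text and splits it on runs of non-word characters. */
    private static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("\\W+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        TinyBm25Index index = new TinyBm25Index();
        index.add("lucene is a search library");
        index.add("the quick brown fox jumps over the lazy dog today");
        index.add("search search search with lucene");
        System.out.println(index.search("search")); // prints [2, 0]
    }
}
```

Document 2 ranks first because it mentions "search" three times in the same number of words as document 0. In a real application you would let Lucene handle the analysis, indexing and scoring; the sketch is only meant to show the idea behind "best results returned first".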

Cross-Platform Solution

Available as Open Source software under the Apache License, which lets you use Lucene in both commercial and Open Source programs
100%-pure Java
Index-compatible implementations available in other programming languages

Lucene™ Tutorials

Search Tutorials

Search capabilities have grown in importance as the quantity of digital data on computers and on the web has grown exponentially. These tutorials focus on Apache Lucene, a fantastic library for building fast, powerful search functionality into applications.

Kumar's Blog