Lucene

From Encoresoup - The Ultimate Guide to Free/Open Source Software

This article contains content from the Wikipedia article "Lucene".
Lucene
Developer: Apache Software Foundation
Stable release: 2.4.0 (12 October 2008)

Genre: Search and index
License: Apache License 2.0
Website: http://lucene.apache.org

Lucene is a free/open source information retrieval library, originally written in Java by Doug Cutting. It is supported by the Apache Software Foundation and is released under the Apache License 2.0.

Lucene has been ported to programming languages including Delphi, Perl, C#, C++, Python, Ruby and PHP.

Features and common use

Lucene is suitable for any application that requires full-text indexing and searching, and it is widely recognized for its utility in implementing Internet search engines and local, single-site search.

At the core of Lucene's logical architecture is the idea of a document containing fields of text. This flexibility makes Lucene's API independent of file format: text from PDFs, HTML, Microsoft Word documents, and many other formats can be indexed, as long as the textual information can be extracted.
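The document-and-fields idea can be sketched with a toy in-memory inverted index. This is an illustration only and does not use Lucene's API or storage format; the class name `TinyIndex`, its methods, and the sample field names `title` and `body` are all hypothetical.

```python
from collections import defaultdict
import re


class TinyIndex:
    """Toy inverted index over documents made of named text fields.

    Illustration only: this is not Lucene's API or its index format.
    """

    def __init__(self):
        # (field, term) -> set of ids of documents containing the term in that field
        self._postings = defaultdict(set)
        self._docs = []

    def add_document(self, fields):
        """Index a document given as a {field_name: text} mapping; return its id."""
        doc_id = len(self._docs)
        self._docs.append(dict(fields))
        for field, text in fields.items():
            # Very simple analysis step: lowercase and split on non-word characters.
            for term in re.findall(r"\w+", text.lower()):
                self._postings[(field, term)].add(doc_id)
        return doc_id

    def search(self, field, term):
        """Return the sorted ids of documents whose `field` contains `term`."""
        return sorted(self._postings.get((field, term.lower()), set()))


index = TinyIndex()
# The index only sees extracted text; the original file format is irrelevant.
pdf_id = index.add_document({"title": "Annual Report", "body": "Revenue grew in 2008."})
html_id = index.add_document({"title": "Home page", "body": "Read our annual report online."})
hits = index.search("body", "report")
```

Because the index receives only field names and plain text, the same `add_document` call works whether the text came from a PDF, an HTML page, or a word-processor file.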

Lucene-based projects

Lucene itself is only an indexing and search library; it does not include crawling or HTML parsing functionality. The Apache project Nutch is based on Lucene and provides this functionality. The Apache project Solr is a full-featured search server based on Lucene, and Compass is a Java search engine framework built on top of Lucene.
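Since HTML parsing is left to the application or to a project built on top of the library, the text-extraction step that precedes indexing might look like the following sketch. It uses only Python's standard `html.parser` module and is not how Nutch or Solr do it; `TextExtractor` and `html_to_text` are hypothetical names.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text from HTML, skipping <script> and <style> content."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0   # greater than 0 while inside <script> or <style>
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped elements.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def html_to_text(html):
    """Return the text content of an HTML string as a single line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


page = (
    "<html><head><title>Lucene</title><style>p {color: red}</style></head>"
    "<body><p>Full text <b>search</b> library.</p></body></html>"
)
plain = html_to_text(page)
```

The resulting plain string is what would be handed to the indexing library as a field value; markup and styling never reach the index.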

See also

  • Hadoop
  • Nutch
  • Solr
  • Compass
  • Hibernate Search
  • Jaeksoft WebSearch
