# apache/lucene-solr

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/apache-lucene-solr).**

4,357 stars · 2,596 forks · Apache-2.0

## Links

- GitHub: https://github.com/apache/lucene-solr
- Homepage: https://lucene.apache.org/
- awesome-repositories: https://awesome-repositories.com/repository/apache-lucene-solr.md

## Topics

`backend` `information-retrieval` `java` `lucene` `nosql` `search` `search-engine` `solr`

## Description

This project is a full-text search engine and enterprise search infrastructure designed for indexing and retrieving large sets of documents. It provides a comprehensive framework for information discovery using ranked results and linguistic analysis.
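
To make the idea of ranked full-text retrieval concrete, here is a minimal sketch of an inverted index with TF-IDF scoring. It is a toy illustration in plain Python, not Lucene's or Solr's API: the function names (`tokenize`, `build_index`, `search`), the regex analyzer, and the scoring formula are all assumptions chosen for brevity.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Toy analyzer: lowercase, then split on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Build an inverted index: term -> {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, tf in Counter(tokenize(text)).items():
            index[term][doc_id] = tf
    return index


def search(index, n_docs, query):
    """Rank documents by summed TF-IDF over the query terms (hypothetical formula)."""
    scores = defaultdict(float)
    for term in tokenize(query):
        postings = index.get(term, {})
        if not postings:
            continue
        idf = math.log(1 + n_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    # Highest score first; ties broken by document id.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


docs = {
    1: "Lucene is a search library",
    2: "Solr is a search server built on Lucene",
    3: "A cookbook about pasta",
}
index = build_index(docs)
results = search(index, len(docs), "solr search")
print([doc_id for doc_id, _ in results])  # document 2 ranks ahead of document 1
```

Document 2 matches both query terms, and the rarer term ("solr") carries more weight, so it outranks document 1. Production engines use more refined scoring models and compressed on-disk postings, which this sketch does not attempt to show.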

The system integrates high-dimensional vector similarity search for semantic retrieval alongside traditional full-text capabilities. It distinguishes itself through support for geospatial data retrieval, multilingual text processing, and a search suggestion workflow that includes typo-tolerant query completion and spellchecking.
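
As a rough illustration of typo-tolerant suggestions, the sketch below ranks vocabulary terms by edit distance to a misspelled word and then by term frequency. This is an assumption-laden toy: the `suggest` function, the `max_edits` parameter, and the sample vocabulary are hypothetical, and a real engine would typically use automata or finite-state structures rather than comparing against every term.

```python
def edit_distance(a, b):
    """Levenshtein distance computed with a rolling dynamic-programming row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def suggest(word, vocabulary, max_edits=2):
    """Return terms within max_edits, closest first, then most frequent first."""
    candidates = []
    for term, freq in vocabulary.items():
        distance = edit_distance(word, term)
        if distance <= max_edits:
            candidates.append((distance, -freq, term))
    return [term for _, _, term in sorted(candidates)]


# Hypothetical vocabulary: indexed term -> document frequency.
vocabulary = {"search": 120, "searcher": 15, "perch": 3, "solr": 80}
print(suggest("serch", vocabulary))  # ['search', 'perch']
```

Both "search" and "perch" are one edit away from "serch", so frequency in the indexed content decides the order. That is the sense in which suggestions are based on the index rather than on a fixed dictionary.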

The platform covers a broad range of search and indexing capabilities, including complex query execution, facet count aggregation, and result grouping. It handles text analysis through tokenization and normalization, while offering specialized tools for document joining, search hit highlighting, and custom scoring based on recency and distance.
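
Facet count aggregation can be sketched as counting field values across the documents that matched a query. The snippet below is a simplified illustration under stated assumptions: hits are plain dictionaries, and `facet_counts` is a hypothetical helper, not a function from Lucene or Solr.

```python
from collections import Counter


def facet_counts(hits, field):
    """Count how many matching documents carry each value of `field`."""
    counts = Counter()
    for doc in hits:
        value = doc.get(field)
        if value is not None:  # documents without the field are skipped
            counts[value] += 1
    return counts.most_common()


# Hypothetical result set for some query.
hits = [
    {"id": 1, "language": "java", "year": 2019},
    {"id": 2, "language": "python", "year": 2020},
    {"id": 3, "language": "java", "year": 2020},
    {"id": 4, "year": 2021},
]
print(facet_counts(hits, "language"))  # [('java', 2), ('python', 1)]
```

A search UI would render these pairs as clickable filters. Real implementations read the values from column-oriented per-field storage instead of materialized documents, so counting stays fast on large result sets.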

A Python search interface is available to expose indexing and querying functionality to external programming environments.

## Tags

### Data & Databases

- [Full Text Search](https://awesome-repositories.com/f/data-databases/full-text-search.md) — Provides a comprehensive engine for indexing and retrieving large sets of documents using ranked lexical search results. ([source](https://lucene.apache.org/core/))
- [Full-Text Inverted Indexes](https://awesome-repositories.com/f/data-databases/index-construction/full-text-inverted-indexes.md) — Implements inverted indexes that map individual words to documents to enable efficient full-text retrieval.
- [Full-Text Search Engines](https://awesome-repositories.com/f/data-databases/full-text-search-engines.md) — Provides a full-text search engine for indexing and retrieving text-based content across large datasets.
- [Geospatial Search](https://awesome-repositories.com/f/data-databases/geospatial-search.md) — Implements advanced querying of points and polygons to find documents based on geographic coordinates and spatial relations. ([source](https://lucene.apache.org/core/corenews.html))
- [Linguistic Analysis Frameworks](https://awesome-repositories.com/f/data-databases/linguistic-analysis-frameworks.md) — Provides a framework for applying language-specific processing such as tokenization and stemming during indexing.
- [Multilingual Text Processing](https://awesome-repositories.com/f/data-databases/multilingual-text-processing.md) — Handles language-specific tokenization, stemming, and normalization to ensure accurate search results across different languages.
- [Search Index Management](https://awesome-repositories.com/f/data-databases/search-index-management.md) — Provides tools for configuring, querying, and maintaining search indices through concurrent indexing and segment merging. ([source](https://lucene.apache.org/core/corenews.html))
- [Complex Search Querying](https://awesome-repositories.com/f/data-databases/search-indexing/complex-search-querying.md) — Implements advanced query mechanisms using a variety of search operators for retrieving filtered and aggregated datasets. ([source](https://cdn.jsdelivr.net/gh/apache/lucene-solr@master/README.md))
- [HNSW Indexes](https://awesome-repositories.com/f/data-databases/vector-indexing/hnsw-indexes.md) — Uses Hierarchical Navigable Small World indexes for fast approximate nearest-neighbor search on high-dimensional vector data.
- [Vector Similarity Search](https://awesome-repositories.com/f/data-databases/vector-similarity-search.md) — Performs high-dimensional vector similarity search to enable semantic retrieval and AI-driven content discovery. ([source](https://lucene.apache.org/core/corenews.html))
- [Asynchronous Segment Merging](https://awesome-repositories.com/f/data-databases/high-throughput-indexing/asynchronous-segment-merging.md) — Organizes index data into immutable chunks and uses background merging to optimize search performance and reclaim space.
- [Columnar Formats](https://awesome-repositories.com/f/data-databases/in-memory-data-stores/columnar-formats.md) — Provides field data in a column-oriented format to enable fast sorting, faceting, and aggregation.
- [Term Highlighting](https://awesome-repositories.com/f/data-databases/result-set-search/result-renderers/search-result-formatters/search-hit-renderers/term-highlighting.md) — Marks matching terms within documents using index offsets to show exactly where search terms appear. ([source](https://lucene.apache.org/core/corenews.html))
- [Faceted Search Engines](https://awesome-repositories.com/f/data-databases/search-indexing-technologies/search-indexing/search-and-indexing/search-interface-components/faceted-navigation/faceted-search-engines.md) — Provides backend logic for aggregating document counts into categories to enable faceted navigation. ([source](https://lucene.apache.org/core/corenews.html))
- [Spellchecking](https://awesome-repositories.com/f/data-databases/search-indexing/complex-search-querying/spellchecking.md) — Identifies misspelled words in a query and suggests alternatives based on the indexed content. ([source](https://lucene.apache.org/))
- [Index Memory Management](https://awesome-repositories.com/f/data-databases/search-indexing/index-memory-management.md) — Optimizes memory and disk consumption using sparse doc values and columnar batch indexing. ([source](https://lucene.apache.org/core/corenews.html))
- [Search Result Categorizers](https://awesome-repositories.com/f/data-databases/search-result-aggregators/search-result-categorizers.md) — Organizes search output using faceting and joins to allow users to narrow findings by specific attributes. ([source](https://lucene.apache.org/core/))
- [Search Result Filtering](https://awesome-repositories.com/f/data-databases/search-result-filtering.md) — Refines search hits using numeric ranges and keyword filters to narrow down relevant content. ([source](https://lucene.apache.org/core/corenews.html))
- [Search Suggestions](https://awesome-repositories.com/f/data-databases/search-suggestions.md) — Implements prefix-based search suggestions and typo-tolerant autocomplete to help users discover correct search terms.
- [Ranked Query Completions](https://awesome-repositories.com/f/data-databases/search-suggestions/ranked-query-completions.md) — Offers typo-tolerant query completions as users type to help discover relevant search terms quickly. ([source](https://lucene.apache.org/core/))
- [Vector Quantization](https://awesome-repositories.com/f/data-databases/vector-quantization.md) — Compresses high-precision vectors into lower-bit representations to reduce memory overhead.
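
The vector quantization entry above can be illustrated with a generic min-max scalar quantizer, which maps each float to a small integer code. This is a sketch of the general technique only; it is not claimed to match the quantization scheme the project actually uses, and the names `quantize` and `dequantize` are hypothetical.

```python
def quantize(vector, bits=8):
    """Map floats to integer codes in [0, 2**bits - 1] using min-max scaling."""
    lo, hi = min(vector), max(vector)
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((x - lo) / scale) for x in vector]
    return codes, lo, scale


def dequantize(codes, lo, scale):
    """Reconstruct approximate floats from the integer codes."""
    return [lo + code * scale for code in codes]


vector = [-1.0, -0.25, 0.0, 0.5, 1.0]
codes, lo, scale = quantize(vector)
restored = dequantize(codes, lo, scale)

# The reconstruction error per component is bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(vector, restored))
print(codes[0], codes[-1], max_error <= scale / 2 + 1e-9)
```

Storing one byte per dimension instead of a 4-byte float cuts memory roughly fourfold, at the cost of a small, bounded loss of precision in similarity scores.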

### Artificial Intelligence & ML

- [Text Tokenization](https://awesome-repositories.com/f/artificial-intelligence-ml/natural-language-processing/text-tokenization.md) — Breaks raw text into searchable tokens using linguistic rules to improve search accuracy. ([source](https://lucene.apache.org/))

### Part of an Awesome List

- [Enterprise Search](https://awesome-repositories.com/f/awesome-lists/data/enterprise-search.md) — Provides search infrastructure for indexing and retrieval across diverse datasets with support for faceting and grouping.
- [Databases and Storage](https://awesome-repositories.com/f/awesome-lists/data/databases-and-storage.md) — Search engine library and server for full-text indexing.
- [Search Engines](https://awesome-repositories.com/f/awesome-lists/data/search-engines.md) — Enterprise search platform built on top of Lucene.

### User Interface & Experience

- [Relevance Scoring](https://awesome-repositories.com/f/user-interface-experience/search-result-ranking/relevance-scoring.md) — Modifies result rankings using static features, recency, and distance to calculate document relevance. ([source](https://lucene.apache.org/core/corenews.html))
- [Offset-Based Highlighting](https://awesome-repositories.com/f/user-interface-experience/syntax-highlighters/regex-based-highlighting/offset-based-highlighting.md) — Marks matching terms within documents using character position offsets from the index to highlight search hits.
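
Offset-based highlighting can be sketched as follows: the analyzer records each token's start and end character positions, and the highlighter wraps the matching spans in the original text. The code is a minimal illustration in plain Python. The function names and the `<b>` markers are assumptions, and the offsets are recomputed on the fly here, whereas an engine would read them from the index.

```python
import re


def tokenize_with_offsets(text):
    """Yield (normalized term, start offset, end offset) for each token."""
    return [(m.group().lower(), m.start(), m.end())
            for m in re.finditer(r"\w+", text)]


def highlight(text, query_terms, pre="<b>", post="</b>"):
    """Wrap every token that matches a query term, preserving the original text."""
    terms = {term.lower() for term in query_terms}
    pieces, cursor = [], 0
    for term, start, end in tokenize_with_offsets(text):
        if term in terms:
            pieces.append(text[cursor:start])             # untouched text before the hit
            pieces.append(pre + text[start:end] + post)   # the hit, original casing kept
            cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


print(highlight("Solr builds on Lucene.", ["lucene", "solr"]))
# <b>Solr</b> builds on <b>Lucene</b>.
```

Matching happens on the normalized term while the output is sliced from the original string, so casing and punctuation survive intact.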
