# facebookresearch/faiss

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/facebookresearch-faiss).**

39,135 stars · 4,244 forks · C++ · MIT

## Links

- GitHub: https://github.com/facebookresearch/faiss
- Homepage: https://faiss.ai
- awesome-repositories: https://awesome-repositories.com/repository/facebookresearch-faiss.md

## Description

Faiss is a library for efficient similarity search and clustering of dense vectors, built to scale to very large datasets. It works as a vector similarity search engine: it organizes vectors into index structures that support fast nearest-neighbor queries over millions of records.

The library offers several indexing and compression techniques. Hierarchical navigable small world (HNSW) graphs give search times that grow roughly logarithmically with dataset size, and inverted file (IVF) indexes partition the vector space so that a query scans only a small subset of the data. To handle large-scale data, product quantization compresses vectors to reduce memory use, and SIMD vector instructions accelerate distance computations. When exact results are required, the library also supports exhaustive brute-force search.
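The inverted-file idea described above can be illustrated with a minimal sketch in plain Python. This is not the Faiss API or its implementation: the function names, the toy data, and the hand-picked centroids are all hypothetical, and the centroids are assumed to come from an earlier clustering step. The `nprobe` argument only mirrors the usual name for the number of partitions scanned per query.

```python
# Illustrative only: plain-Python inverted-file search, not the Faiss API.

def sq_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_inverted_lists(vectors, centroids):
    """Assign each vector id to the list of its nearest centroid."""
    lists = {c: [] for c in range(len(centroids))}
    for vec_id, vec in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: sq_l2(vec, centroids[c]))
        lists[nearest].append(vec_id)
    return lists


def ivf_search(query, vectors, centroids, lists, k, nprobe):
    """Scan only the nprobe lists whose centroids are closest to the query."""
    by_distance = sorted(range(len(centroids)), key=lambda c: sq_l2(query, centroids[c]))
    probed = by_distance[:nprobe]
    candidates = [vec_id for c in probed for vec_id in lists[c]]
    scored = sorted((sq_l2(query, vectors[i]), i) for i in candidates)
    return scored[:k]  # list of (squared distance, vector id)


# Hypothetical toy data: two well-separated clusters in 2-D.
vectors = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [5.0, 5.0], [5.1, 4.9], [4.8, 5.2]]
centroids = [[0.1, 0.1], [5.0, 5.0]]  # assumed output of a prior clustering step
lists = build_inverted_lists(vectors, centroids)
hits = ivf_search([5.0, 5.05], vectors, centroids, lists, k=2, nprobe=1)
```

With `nprobe=1` only one partition is scanned, so half of the toy database is never compared against the query. Raising `nprobe` scans more partitions, which costs more time but is less likely to miss a true neighbor that sits just across a partition boundary.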

Beyond its core indexes, the library covers the full vector search workflow, from formatting floating-point data as row-major matrices to running nearest-neighbor queries. It supports memory-mapped index storage, which lets it work with datasets larger than physical memory, and it serves as a foundation for feature retrieval in machine learning.
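As a rough illustration of that workflow, the sketch below stores vectors in a row-major layout and runs an exact nearest-neighbor query over them. It is written in plain Python with hypothetical helper names (`to_row_major`, `exhaustive_search`) and toy data; it does not use or reproduce the Faiss API.

```python
# Illustrative only: row-major storage plus exact search, not the Faiss API.
import heapq


def to_row_major(rows):
    """Flatten n vectors of dimension d into one contiguous list, row after row."""
    d = len(rows[0])
    if any(len(r) != d for r in rows):
        raise ValueError("all vectors must share the same dimension")
    flat = [float(x) for r in rows for x in r]
    return flat, len(rows), d


def exhaustive_search(flat, n, d, query, k):
    """Exact k-nearest-neighbor search by squared L2 distance over every row."""
    def dist(i):
        base = i * d  # row i occupies flat[i*d : (i+1)*d]
        return sum((flat[base + j] - query[j]) ** 2 for j in range(d))

    best = heapq.nsmallest(k, ((dist(i), i) for i in range(n)))
    distances = [dv for dv, _ in best]
    ids = [i for _, i in best]
    return distances, ids


database = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0]]  # hypothetical data
flat, n, d = to_row_major(database)
distances, ids = exhaustive_search(flat, n, d, query=[1.0, 0.9, 0.0], k=2)
```

In a row-major layout each vector occupies `d` consecutive values, so vector `i` starts at offset `i * d`. The search returns two parallel lists, the distances and the row ids of the closest vectors, ordered from nearest to farthest.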

## Tags

### Data & Databases

- [Vector Search Engines](https://awesome-repositories.com/f/data-databases/vector-search-engines.md) — Provides a high-performance library for efficient similarity search and clustering of dense vectors across massive datasets.
- [Approximate Nearest Neighbor Search](https://awesome-repositories.com/f/data-databases/approximate-nearest-neighbor-search.md) — Trades a small amount of accuracy for much faster lookups when querying extremely large vector databases.
- [Vector Similarity Search](https://awesome-repositories.com/f/data-databases/vector-similarity-search.md) — Enables identifying nearest neighbors in large datasets using distance metrics like Euclidean or inner product for fast and accurate results. ([source](https://faiss.ai))
- [High-Performance Vector Indexing](https://awesome-repositories.com/f/data-databases/high-performance-vector-indexing.md) — Organizes complex numerical data into specialized structures that allow for rapid retrieval and efficient querying across millions of records.
- [Similarity Search](https://awesome-repositories.com/f/data-databases/similarity-search.md) — Finds the most relevant items in massive datasets by comparing mathematical representations of data points based on their proximity.
- [Graph-Based Indexing](https://awesome-repositories.com/f/data-databases/graph-based-indexing.md) — Constructs a multi-layered graph structure in which finding approximate nearest neighbors in high-dimensional space takes roughly logarithmic time.
- [Vector Indices](https://awesome-repositories.com/f/data-databases/vector-indices.md) — Organizes high-dimensional vectors to enable rapid retrieval of relevant items without exhaustive linear scanning.
- [K-Nearest Neighbor Retrieval](https://awesome-repositories.com/f/data-databases/k-nearest-neighbor-retrieval.md) — Executes searches against an indexed dataset to retrieve unique identifiers and distance scores for the most relevant vectors. ([source](https://github.com/facebookresearch/faiss/wiki/Getting-started))
- [Product Quantization](https://awesome-repositories.com/f/data-databases/product-quantization.md) — Reduces memory footprint by decomposing high-dimensional vectors into smaller sub-vectors represented by compact codebook indices.
- [Inverted File Indices](https://awesome-repositories.com/f/data-databases/inverted-file-indices.md) — Partitions the vector space into Voronoi cells to limit search scope to a small subset of the total database.
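
The product quantization entry above can be made concrete with a minimal plain-Python sketch. It is not the Faiss implementation: the helper names and the two tiny codebooks are hypothetical, and the codebooks are assumed to have been trained beforehand (in practice, typically by clustering the sub-vectors of a training set).

```python
# Illustrative only: product quantization encode/decode, not the Faiss API.

def sq_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def split(vec, m):
    """Cut a vector into m equal-length sub-vectors."""
    if len(vec) % m != 0:
        raise ValueError("dimension must be divisible by the number of sub-vectors")
    step = len(vec) // m
    return [vec[i * step:(i + 1) * step] for i in range(m)]


def pq_encode(vec, codebooks):
    """Replace each sub-vector by the index of its nearest codeword."""
    subs = split(vec, len(codebooks))
    return [
        min(range(len(book)), key=lambda c: sq_l2(sub, book[c]))
        for sub, book in zip(subs, codebooks)
    ]


def pq_decode(code, codebooks):
    """Rebuild an approximate vector by concatenating the chosen codewords."""
    approx = []
    for idx, book in zip(code, codebooks):
        approx.extend(book[idx])
    return approx


# Hypothetical, hand-written codebooks: 2 sub-spaces with 2 codewords each.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],  # codewords for dimensions 0-1
    [[0.0, 1.0], [1.0, 0.0]],  # codewords for dimensions 2-3
]
code = pq_encode([0.9, 1.1, 0.1, 0.8], codebooks)
approx = pq_decode(code, codebooks)
```

The four-dimensional input is stored as just two small integers, one per sub-vector. Decoding gives an approximation of the original vector rather than an exact copy, which is the accuracy cost paid for the smaller memory footprint.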

### Artificial Intelligence & ML

- [Embedding Retrieval](https://awesome-repositories.com/f/artificial-intelligence-ml/embedding-retrieval.md) — Accesses and matches high-dimensional embeddings generated by neural networks to support real-time recommendations and pattern recognition tasks.

### Hardware & IoT

- [GPU & Performance](https://awesome-repositories.com/f/hardware-iot/integration-performance/gpu-performance.md) — Uses hardware-level vector (SIMD) instructions to speed up arithmetic on floating-point data during similarity search.

### Scientific & Mathematical Computing

- [Numerical Libraries](https://awesome-repositories.com/f/scientific-mathematical-computing/numerical-mathematical-foundations/numerical-libraries.md) — Provides optimized algorithms for fast matrix operations and distance calculations on large-scale floating-point datasets.
