# unum-cloud/usearch

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/unum-cloud-usearch).**

3,888 stars · 286 forks · C++ · apache-2.0

## Links

- GitHub: https://github.com/unum-cloud/USearch
- Homepage: https://unum.cloud/usearch
- awesome-repositories: https://awesome-repositories.com/repository/unum-cloud-usearch.md

## Topics

`approximate-nearest-neighbor-search` `clustering` `database` `faiss` `full-text-search` `fuzzy-search` `image-search` `kann` `nearest-neighbor-search` `recommender-system` `search` `search-engine` `semantic-search` `simd` `similarity-search` `text-search` `vector-search` `webassembly`

## Description

USearch is a high-performance vector similarity search engine and approximate nearest neighbor (ANN) index for dense embeddings. It serves as a low-level core for vector databases, providing the primitives needed to store, index, and retrieve high-dimensional vectors at large scale.

Its distinguishing features are hardware-level SIMD acceleration for distance kernels and a proximity-graph index that enables fast retrieval across billions of vectors. It supports multi-precision vector quantization to balance memory usage against accuracy, and uses memory-mapped index files to reduce RAM overhead when loading a serialized index.
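As a rough illustration of the memory-versus-accuracy trade-off that quantization involves, the sketch below scales a float vector into signed 8-bit integers and compares cosine similarity before and after. It uses only the Python standard library. `quantize_i8` and `cosine` are hypothetical helper names for this example and are not part of USearch's API, whose actual quantization scheme may differ.

```python
import math
from array import array


def quantize_i8(vector):
    """Normalize a float vector to unit length, then scale it into [-127, 127]."""
    norm = math.sqrt(sum(x * x for x in vector)) or 1.0
    return [max(-127, min(127, round(x / norm * 127))) for x in vector]


def cosine(a, b):
    """Cosine similarity; works for float and integer vectors alike."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Two made-up embeddings, used only to compare the two precisions.
a = [0.1, 0.7, -0.3, 0.5]
b = [0.2, 0.6, -0.1, 0.4]

full_precision = cosine(a, b)
quantized = cosine(quantize_i8(a), quantize_i8(b))

# Storage per component: 4 bytes as a 32-bit float, 1 byte as a signed 8-bit integer.
bytes_f32 = array("f", a).itemsize
bytes_i8 = array("b", quantize_i8(a)).itemsize
```

For this pair of vectors the two similarity values agree closely while each component shrinks from four bytes to one. How much error is acceptable depends on the dataset and the recall target.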

Capabilities include exact brute-force linear scans, batched similarity searches, and thread-safe concurrent index construction. It implements multiple distance metrics, including Euclidean, Hamming, Jaccard, and Haversine (for geospatial proximity), and accepts custom user-defined metric functions.
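The Haversine, Hamming, and Jaccard metrics are simple to state, and a brute-force scan amounts to a top-k selection over every stored vector. The following is a minimal standard-library sketch, not USearch's implementation: the function names, the `EARTH_RADIUS_KM` constant, and the dictionary-based dataset are assumptions made for this example.

```python
import heapq
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(a, b):
    """Great-circle distance between two (latitude, longitude) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def hamming(a, b):
    """Number of differing bits between two integers used as bit vectors."""
    return bin(a ^ b).count("1")


def jaccard(a, b):
    """Jaccard distance between two sets: 1 - |intersection| / |union|."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def exact_search(dataset, query, k, metric):
    """Brute-force scan: score every entry and return the keys of the k closest."""
    scored = ((metric(vector, query), key) for key, vector in dataset.items())
    return [key for _, key in heapq.nsmallest(k, scored)]


# Illustrative data: approximate city coordinates as (latitude, longitude).
cities = {
    "paris": (48.8566, 2.3522),
    "london": (51.5074, -0.1278),
    "berlin": (52.5200, 13.4050),
}
brussels = (50.8503, 4.3517)
nearest_two = exact_search(cities, brussels, 2, haversine_km)
```

Passing the metric as a plain callable mirrors the idea of user-defined metrics: any function that takes two items and returns a comparable distance can be swapped in.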

Additional utilities include vector clustering, semantic joins between datasets, and tools for benchmarking search performance and evaluating accuracy.

## Tags

### Data & Databases

- [Approximate Nearest Neighbor Search](https://awesome-repositories.com/f/data-databases/approximate-nearest-neighbor-search.md) — Ships a proximity-graph based indexing system for fast approximate nearest neighbor searches across massive datasets. ([source](https://unum-cloud.github.io/USearch/python/reference.html))
- [Graph-Based Indexing](https://awesome-repositories.com/f/data-databases/graph-based-indexing.md) — Organizes high-dimensional vectors into a graph structure to enable fast approximate nearest neighbor search via traversal.
- [Vector Similarity Search](https://awesome-repositories.com/f/data-databases/vector-similarity-search.md) — Implements a high-performance engine for finding nearest neighbors in high-dimensional datasets using SIMD-accelerated distance metrics.
- [Proximity Graph Indexes](https://awesome-repositories.com/f/data-databases/approximate-nearest-neighbor-search/proximity-graph-indexes.md) — Uses a proximity-graph indexing system to enable fast retrieval across billions of vectors.
- [Index Construction](https://awesome-repositories.com/f/data-databases/index-construction.md) — Provides thread-safe concurrent insertion and pre-allocated memory management for constructing vector indices. ([source](https://unum-cloud.github.io/USearch/cpp))
- [Memory-Mapped Indexing](https://awesome-repositories.com/f/data-databases/memory-mapped-indexing.md) — Maps binary index files directly into the virtual address space to enable fast loading and low RAM overhead.
- [Serializable Indexes](https://awesome-repositories.com/f/data-databases/search-indexing-technologies/search-indexing/search-and-indexing/serializable-indexes.md) — Writes the proximity graph and metadata to a binary file and reads it back without requiring a full memory load. ([source](https://unum-cloud.github.io/USearch/cpp/reference.html))
- [Similarity Search](https://awesome-repositories.com/f/data-databases/similarity-search.md) — Implements both approximate graph-based and exact SIMD-optimized similarity search methods. ([source](https://unum-cloud.github.io/USearch/))
- [Similarity Search Engines](https://awesome-repositories.com/f/data-databases/similarity-search-engines.md) — Provides a computational engine optimized for identifying nearest neighbors in high-dimensional spaces using advanced indexing.
- [Vector Databases](https://awesome-repositories.com/f/data-databases/vector-databases.md) — Functions as a low-level storage and retrieval system for dense embeddings with quantization and memory-mapping.
- [Vector Indexing](https://awesome-repositories.com/f/data-databases/vector-indexing.md) — Provides tools for creating and managing high-dimensional vector indexes with support for binary serialization and memory-mapped storage.
- [Vector Quantization](https://awesome-repositories.com/f/data-databases/vector-quantization.md) — Supports storing vectors in various bit-depths from double precision to 1-bit binary to balance accuracy and memory usage.
- [High-Dimensional Vector Compressors](https://awesome-repositories.com/f/data-databases/vector-quantization/high-dimensional-vector-compressors.md) — Reduces the memory footprint of vector collections through quantization and precision casting.
- [SIMD-Accelerated Arithmetic](https://awesome-repositories.com/f/data-databases/vectorized-arithmetic/simd-accelerated-arithmetic.md) — Computes vector similarities using hardware-level SIMD instructions for high-throughput numerical processing.
- [Vector Distance Kernels](https://awesome-repositories.com/f/data-databases/vectorized-arithmetic/simd-accelerated-arithmetic/vector-distance-kernels.md) — Ships hardware-level SIMD acceleration for Euclidean, cosine, and Hamming distance kernels.
- [Billion-Scale Vector Search](https://awesome-repositories.com/f/data-databases/approximate-nearest-neighbor-search/billion-scale-vector-search.md) — Scales similarity search to billions of vectors using custom integer types and optimized memory management. ([source](https://cdn.jsdelivr.net/gh/unum-cloud/usearch@main/README.md))
- [Batch Processing](https://awesome-repositories.com/f/data-databases/batch-processing.md) — Allows adding multiple concatenated vectors in a single operation to increase indexing throughput. ([source](https://unum-cloud.github.io/USearch/java))
- [Brute-Force Search Algorithms](https://awesome-repositories.com/f/data-databases/brute-force-search-algorithms.md) — Performs an exact search by calculating distances across every vector in a dataset using parallel priority queues.
- [Semantic Joins](https://awesome-repositories.com/f/data-databases/data-joins/semantic-joins.md) — Creates mappings between two large vector datasets using approximate similarity to perform fuzzy or semantic joins.
- [Geospatial Search](https://awesome-repositories.com/f/data-databases/geospatial-search.md) — Calculates real-world distances between spherical coordinates using Haversine metrics for location-based retrieval.
- [Quantized Optimizers](https://awesome-repositories.com/f/data-databases/memory-optimization-strategies/training-memory-optimizers/quantized-optimizers.md) — Implements floating-point and integer quantization to balance search accuracy and memory consumption. ([source](https://unum-cloud.github.io/USearch/java))
- [Parallel Index Lookups](https://awesome-repositories.com/f/data-databases/search-indexing-technologies/search-indexing/search-and-indexing/vector-search-indexes/parallel-index-lookups.md) — Supports simultaneous searching across multiple indices to handle workloads containing trillions of vectors. ([source](https://unum-cloud.github.io/USearch/))
- [User-Defined Functions](https://awesome-repositories.com/f/data-databases/user-defined-functions.md) — Allows the integration of custom compiled functions or assembly code to implement specialized distance calculations.
- [Vector Index Compression](https://awesome-repositories.com/f/data-databases/vector-indexing/vector-index-compression.md) — Reduces memory footprint by storing vectors as half-precision floats, 8-bit integers, or packed binary formats. ([source](https://unum-cloud.github.io/USearch/rust/index.html))
- [Batch Vector Queries](https://awesome-repositories.com/f/data-databases/vector-search/batch-vector-queries.md) — Enables simultaneous querying of multiple vectors using tensors to increase throughput for bulk workloads. ([source](https://unum-cloud.github.io/USearch/python))
- [Brute Force Linear Scans](https://awesome-repositories.com/f/data-databases/vector-similarity-search/brute-force-linear-scans.md) — Provides exact vector search via brute-force linear scans for guaranteed accuracy on smaller datasets. ([source](https://unum-cloud.github.io/USearch/golang))
- [User-Defined Distance Metrics](https://awesome-repositories.com/f/data-databases/vector-similarity-search/user-defined-distance-metrics.md) — Supports the integration of custom compiled functions or assembly code to implement specialized distance calculations. ([source](https://cdn.jsdelivr.net/gh/unum-cloud/usearch@main/README.md))
- [Vectorized Data Processing](https://awesome-repositories.com/f/data-databases/vectorized-data-processing.md) — Processes multiple query vectors simultaneously using flattened arrays to maximize throughput for bulk similarity searches.
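
The memory-mapped indexing and serialization entries in the list above can be pictured with a toy example: vectors are written to a flat binary file, then read back through `mmap` so that only the pages actually touched are brought into memory. This is a sketch under stated assumptions, not USearch's file format. The header layout, `save_vectors`, and `MappedVectors` are hypothetical names invented for illustration.

```python
import mmap
import struct

# Hypothetical header: vector count and dimensions as little-endian uint32.
HEADER = struct.Struct("<II")
FLOAT_SIZE = 4  # bytes per 32-bit float component


def save_vectors(path, vectors):
    """Write a header followed by fixed-width float32 rows."""
    dims = len(vectors[0])
    with open(path, "wb") as f:
        f.write(HEADER.pack(len(vectors), dims))
        for vector in vectors:
            f.write(struct.pack(f"<{dims}f", *vector))


class MappedVectors:
    """Read-only view over a saved file; rows are decoded on demand."""

    def __init__(self, path):
        self._file = open(path, "rb")
        # Length 0 maps the whole file; pages are loaded lazily by the OS.
        self._map = mmap.mmap(self._file.fileno(), 0, access=mmap.ACCESS_READ)
        self.count, self.dims = HEADER.unpack_from(self._map, 0)

    def __getitem__(self, i):
        if not 0 <= i < self.count:
            raise IndexError(i)
        offset = HEADER.size + i * self.dims * FLOAT_SIZE
        return struct.unpack_from(f"<{self.dims}f", self._map, offset)

    def close(self):
        self._map.close()
        self._file.close()
```

A caller would run `save_vectors("vectors.bin", rows)` once and later open `MappedVectors("vectors.bin")` to fetch individual rows by position without reading the whole file up front.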

### Scientific & Mathematical Computing

- [Vector Distance Metrics](https://awesome-repositories.com/f/scientific-mathematical-computing/vector-distance-metrics.md) — Provides SIMD-accelerated mathematical primitives for calculating distances between high-dimensional vectors. ([source](https://unum-cloud.github.io/USearch/c))

### Software Engineering & Architecture

- [Thread-Safe Index Construction](https://awesome-repositories.com/f/software-engineering-architecture/thread-safe-index-construction.md) — Utilizes a thread-safe processing engine to run multiple additions and searches across different threads simultaneously. ([source](https://unum-cloud.github.io/USearch/java))
- [Concurrent Index Construction](https://awesome-repositories.com/f/software-engineering-architecture/background-thread-dispatchers/thread-safe-dispatchers/concurrent-index-construction.md) — Uses a multi-threaded execution engine to distribute vector additions and index construction across multiple CPU cores.

### Development Tools & Productivity

- [Performance Benchmarks](https://awesome-repositories.com/f/development-tools-productivity/performance-optimization-tools/performance-benchmarks.md) — Includes tools for tracking search efficiency and operations per second to analyze index performance. ([source](https://unum-cloud.github.io/USearch/python/reference.html))

### Testing & Quality Assurance

- [Search Accuracy Evaluation](https://awesome-repositories.com/f/testing-quality-assurance/search-accuracy-evaluation.md) — Measures index quality by comparing approximate results against exact ground-truth searches using recall metrics. ([source](https://unum-cloud.github.io/USearch/python))
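
Recall against exact ground truth, as described in the entry above, is commonly computed as the share of true top-k neighbors that the approximate search also returned, averaged over all queries. The helper below is a minimal sketch of that calculation. `recall_at_k` is a hypothetical name for this example and not a function taken from USearch.

```python
def recall_at_k(approximate, exact, k):
    """Mean fraction of the true top-k neighbors found by the approximate search.

    `approximate` and `exact` are lists of result lists, one per query,
    each holding neighbor keys ordered from closest to farthest.
    """
    if not exact or len(approximate) != len(exact):
        raise ValueError("expected one result list per query in both inputs")
    hits = sum(
        len(set(found[:k]) & set(truth[:k]))
        for found, truth in zip(approximate, exact)
    )
    return hits / (k * len(exact))


# Two made-up queries: the second approximate result misses one true neighbor.
approximate_results = [[1, 2, 3], [4, 5, 9]]
exact_results = [[1, 2, 3], [4, 5, 6]]
recall = recall_at_k(approximate_results, exact_results, k=3)
```

Here five of the six true neighbors are recovered, so recall is 5/6. The exact results would typically come from a brute-force scan over the same data.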

### Part of an Awesome List

- [Machine Learning](https://awesome-repositories.com/f/awesome-lists/ai/machine-learning.md) — Fast search and clustering for vectors.
- [Database Systems](https://awesome-repositories.com/f/awesome-lists/data/database-systems.md) — Similarity search engine for vectors and strings.
