Annoy is a C++ library for approximate nearest neighbor search in high-dimensional vector spaces. It acts as a vector similarity search engine that builds static, file-based index structures for fast lookups. Each vector is stored under an identifier, and the finished index is persisted to disk so that it can be memory-mapped, which gives efficient access to large datasets.
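To make the idea of memory-mapped, identifier-keyed vector storage concrete, here is a minimal sketch in plain Python. It does not reproduce Annoy's file format or API; the fixed-width record layout, the `DIM` constant, and the helper names `save_vectors`, `load_vector`, and `demo` are all assumptions made for illustration.

```python
import mmap
import os
import struct
import tempfile

DIM = 4  # hypothetical vector dimensionality for this sketch
RECORD = struct.Struct(f"<{DIM}f")  # one fixed-size float32 record per vector


def save_vectors(path, vectors):
    """Write vectors back to back; a record's position acts as its identifier."""
    with open(path, "wb") as fh:
        for vec in vectors:
            fh.write(RECORD.pack(*vec))


def load_vector(mapped, item_id):
    """Decode one vector by identifier directly from the mapped file."""
    return RECORD.unpack_from(mapped, item_id * RECORD.size)


def demo():
    # Build a small illustrative dataset and persist it to a temporary file.
    vectors = [[float(i), float(i) + 0.5, 0.0, 1.0] for i in range(1000)]
    path = os.path.join(tempfile.mkdtemp(), "vectors.bin")
    save_vectors(path, vectors)
    with open(path, "rb") as fh:
        # Read-only mapping: the OS pages data in on demand instead of
        # reading the whole file up front.
        mapped = mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ)
        try:
            return load_vector(mapped, 42)
        finally:
            mapped.close()
```

Because every record has the same size, looking up an identifier is a single offset computation, and only the pages that are actually touched need to be resident in memory.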
The project distinguishes itself through its use of random projection trees: the data is partitioned recursively according to the chosen distance metric, producing hierarchical binary trees that trade search precision against computational cost. Because the resulting indices are immutable and memory-mapped, multiple independent processes can share them without each having to load the entire dataset into memory.
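The partitioning idea can be sketched in a few lines of plain Python. This is a simplified illustration of a random projection forest under stated assumptions, not Annoy's actual implementation: the split rule (a hyperplane equidistant from two randomly sampled points, suited to Euclidean distance), the `leaf_size` parameter, the dictionary-based node layout, and all function names are hypothetical choices for this example.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def build_tree(ids, vectors, rng, leaf_size=8):
    """Recursively split ids with a hyperplane between two random points."""
    if len(ids) <= leaf_size:
        return {"leaf": ids}
    a, b = (vectors[i] for i in rng.sample(ids, 2))
    normal = [x - y for x, y in zip(a, b)]
    midpoint = [(x + y) / 2 for x, y in zip(a, b)]
    offset = dot(normal, midpoint)
    left = [i for i in ids if dot(normal, vectors[i]) < offset]
    right = [i for i in ids if dot(normal, vectors[i]) >= offset]
    if not left or not right:  # degenerate split, e.g. duplicate points
        return {"leaf": ids}
    return {"normal": normal, "offset": offset,
            "left": build_tree(left, vectors, rng, leaf_size),
            "right": build_tree(right, vectors, rng, leaf_size)}


def build_forest(vectors, n_trees=10, seed=0):
    """Build several independent trees; more trees means better recall."""
    rng = random.Random(seed)
    ids = list(range(len(vectors)))
    return [build_tree(ids, vectors, rng) for _ in range(n_trees)]


def candidates(tree, query):
    """Descend to the leaf on the query's side of each hyperplane."""
    node = tree
    while "leaf" not in node:
        go_left = dot(node["normal"], query) < node["offset"]
        node = node["left"] if go_left else node["right"]
    return node["leaf"]


def query_forest(forest, vectors, query, k):
    """Union the candidate leaves of all trees, then rank by exact distance."""
    pool = set()
    for tree in forest:
        pool.update(candidates(tree, query))

    def squared_distance(i):
        return sum((x - y) ** 2 for x, y in zip(vectors[i], query))

    return sorted(pool, key=squared_distance)[:k]
```

Each tree inspects only one leaf per query, so a lookup touches a small fraction of the data. Adding trees enlarges the candidate pool, which raises precision at the price of more distance computations; this is the precision-versus-cost trade-off described above. A production search would typically explore more than one leaf per tree, which this sketch omits.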
The library supports a range of indexing and retrieval operations and can scale to datasets larger than available system memory. Cross-language integration comes from generated bindings and standard build system support, so the core search engine can be used from several programming environments.