Qdrant

Qdrant is a high-performance vector similarity database designed to store, index, and search high-dimensional vectors alongside structured metadata. It functions as a distributed search engine that manages large-scale data clusters, providing low-latency retrieval and complex filtering capabilities. The system is built to serve as a specialized middleware layer, connecting machine learning pipelines and AI agents to persistent storage for intelligent information retrieval and recommendation tasks.

The platform distinguishes itself through advanced retrieval techniques, including support for hybrid search that combines dense and sparse vectors, and multivector search that utilizes late interaction models for high-accuracy relevance scoring. It provides robust multi-tenant data isolation, allowing organizations to partition records and manage resources securely within a single cluster. To maintain performance at scale, the engine employs a segment-based storage architecture with asynchronous background optimization, ensuring that indexing and compaction processes do not block incoming queries.
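
The late-interaction scoring mentioned above can be illustrated with a small sketch. It assumes the common "MaxSim" formulation, in which each query token vector is matched to its most similar document token vector and the maxima are summed. The function names and toy vectors are hypothetical, and this is not presented as Qdrant's internal implementation.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def max_sim(query_vectors, doc_vectors):
    """Late-interaction score: for each query token vector, take its best
    match among the document's token vectors, then sum those maxima."""
    return sum(max(cosine(q, d) for d in doc_vectors) for q in query_vectors)

# Hypothetical toy data: two query token vectors and two small documents.
query = [[1.0, 0.0], [0.0, 1.0]]
docs = {
    "doc_a": [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],
    "doc_b": [[1.0, 0.0], [1.0, 0.0]],
}

# Rank documents by their late-interaction score, best first.
ranked = sorted(docs, key=lambda name: max_sim(query, docs[name]), reverse=True)
print(ranked)
```

Because every query token is scored separately, a document that covers both query directions (doc_a) outranks one that matches only a single token strongly (doc_b).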

The system covers a broad capability surface, including comprehensive metadata filtering, geospatial search, and full-text indexing. It supports production-grade operations through distributed consensus protocols, write-ahead logging for durability, and memory-mapped indexing for efficient resource utilization. Administrative features include atomic collection aliasing, point-in-time snapshotting, and integrated tools for metric learning and search recall tuning.
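
As a rough illustration of the write-ahead logging idea, the sketch below appends each modification to a log file and flushes it to disk before applying it in memory, then rebuilds the state by replaying the log. The class name, record format, and file layout are assumptions made for this example and do not describe Qdrant's actual on-disk format.

```python
import json
import os
import tempfile

class TinyWAL:
    """Append-only log: every change is written and synced to disk
    before it is applied to the in-memory state."""

    def __init__(self, path):
        self.path = path
        self.state = {}
        self._replay()

    def _replay(self):
        # Rebuild the in-memory state from the log, e.g. after a restart.
        if not os.path.exists(self.path):
            return
        with open(self.path, "r", encoding="utf-8") as log:
            for line in log:
                self._apply(json.loads(line))

    def _apply(self, record):
        if record["op"] == "upsert":
            self.state[record["id"]] = record["vector"]
        elif record["op"] == "delete":
            self.state.pop(record["id"], None)

    def write(self, record):
        # Durability first: persist the record, then mutate memory.
        with open(self.path, "a", encoding="utf-8") as log:
            log.write(json.dumps(record) + "\n")
            log.flush()
            os.fsync(log.fileno())
        self._apply(record)

wal_path = os.path.join(tempfile.mkdtemp(), "wal.jsonl")
wal = TinyWAL(wal_path)
wal.write({"op": "upsert", "id": 1, "vector": [0.1, 0.9]})
wal.write({"op": "upsert", "id": 2, "vector": [0.7, 0.3]})
wal.write({"op": "delete", "id": 1})

# A second instance on the same file simulates recovery after a crash.
recovered = TinyWAL(wal_path)
print(recovered.state)
```

The ordering in `write` is the point of the technique: a change that was acknowledged is always present in the log, so replay can reconstruct it.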

The project provides standardized REST and gRPC interfaces, supported by official client libraries for various programming environments. It is designed for flexible deployment, offering support for containerized local execution, Kubernetes-based production scaling, and infrastructure-as-code management via Terraform.
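
A filtered similarity query over the REST interface might be assembled as in the sketch below. The port (6333), the `/collections/{name}/points/search` path, and the request body shape are assumptions drawn from Qdrant's public REST reference rather than from this page, and should be checked against the version in use. The collection name, vector, and payload key are hypothetical. The request is only built, not sent, so the sketch runs without a server.

```python
import json
import urllib.request

BASE_URL = "http://localhost:6333"  # assumed address of a local instance
COLLECTION = "articles"             # hypothetical collection name

# Assumed body shape: query vector, result limit, and a payload filter.
body = {
    "vector": [0.2, 0.1, 0.9, 0.7],
    "limit": 3,
    "filter": {"must": [{"key": "lang", "match": {"value": "en"}}]},
}

request = urllib.request.Request(
    url=f"{BASE_URL}/collections/{COLLECTION}/points/search",
    data=json.dumps(body).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Sending is left commented out so the example works offline:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
print(request.get_method(), request.full_url)
```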

Features

Vector Databases - Stores, indexes, and searches high-dimensional vectors alongside structured metadata for intelligent retrieval applications.
Vector Search Engines - Builds high-performance search applications that find relevant information by comparing mathematical representations of data.
Vector Search Engines - Manages high-dimensional vectors in production environments with associated metadata for efficient similarity search.
Hybrid Search Engines - Combines keyword-based retrieval with semantic search for comprehensive results.
Vector Indexing - Retrieves data quickly by indexing vectors using graph-based algorithms.
Vector Storage - Supports in-memory and memory-mapped storage for efficient vector data handling.
Vector Search - Identifies records by constraining search spaces using positive and negative vector pairs.
Distributed Search Engines - Provides a scalable infrastructure platform that manages large-scale data clusters for low-latency similarity search.
Hybrid Search - Combines approximate nearest neighbor algorithms with boolean metadata filtering to narrow search results.
Multivector Search - Retrieves documents with high accuracy using multiple token-level vectors per document.
Agent Memory Systems - Provides persistent storage for autonomous agents to retrieve context and past experiences during reasoning tasks.
Recommendation Engines - Suggests relevant records by calculating search spaces from provided example vectors.
Multi-Tenant Data Stores - Provides a robust storage architecture that isolates data partitions and metadata for multiple users while maintaining consistent performance.
Vector Collection Management - Organizes vectors into collections with shared dimensionality and distance metrics.
Write-Ahead Logging - Records all incoming data modifications in a sequential log to guarantee durability.
Distributed Consensus Protocols - Coordinates state changes across cluster nodes to ensure data consistency and high availability.
Managed Database Services - Automates infrastructure scaling, backups, and performance monitoring for managed database clusters.
Identity and Access Management - Manages account-level security and user membership within the cloud environment.
Transport Security - Secures data transmission between applications and servers using TLS encryption.
Metric Learning - Enables training of metric learning models to measure similarity between objects without requiring manually labeled datasets.
Natural Language Processing - High-performance vector database for similarity search.
Retrieval Augmented Generation - Open-source vector similarity search engine with filtering.
Data Management - Vector similarity search engine with filtering support.
Database Systems - High-performance vector search engine for AI-driven applications.
Databases and RAG - Vector database for AI applications.
Databases & Data - Vector similarity search engine with advanced filtering.
Enterprise Search - Vector similarity search engine with filtering.
Vector Databases - Vector similarity search engine with extended filtering support.
Enterprise Search - Vector similarity search engine with filtering support.
High-Throughput Indexing - Organizes massive volumes of unstructured data into searchable structures for rapid retrieval in production environments.
Memory-Mapped Indexing - Maps vector data directly into virtual address space for efficient access to large datasets.
Metadata Filtering - Restricts search results by matching specific values, sets, or text patterns including range and null-check conditions.
Query Interfaces - Executes similarity searches and random sampling tasks using a unified interface supporting filtering and pagination.
Segmented Storage Architectures - Partitions data into immutable segments that are merged in the background to optimize performance.
Semantic Search Engines - Retrieves relevant documents from a collection by embedding text queries on the fly using integrated machine learning models.
Production Deployment Tools - Supports deployment using managed cloud services, Kubernetes operators, or Helm charts for production-grade scaling.
API Authentication - Restricts database operations to specific access levels using secure API keys.
Identity and Access Management - Controls access to database API keys, cluster backups, and cluster metadata to secure managed cloud infrastructure.
AI Orchestration Frameworks - Embeds vector search capabilities into intelligent agents and orchestration tools to build production-ready applications.
Backup & Recovery - Provides automated snapshot creation and restoration to ensure data persistence and recovery.
Collection Schemas - Allows specification of dimensionality, distance metrics, and data types for optimized storage.
Full Text Search - Improves matching logic by searching for specific substrings, tokens, or phrases within text fields.
Multi-Tenancy Architectures - Manages large-scale datasets for multiple users while ensuring strict privacy and resource separation.
Storage Configuration - Configures in-memory or disk-based persistence to optimize filtering performance.
Vector Quantization - Applies scalar quantization to multi-vector representations to reduce memory usage.
Vector Search Middleware - Connects machine learning pipelines and AI agents to persistent storage for rapid information retrieval.
Client Libraries - Simplifies data operations and ensures reliable communication using official language-specific drivers.
Access Control Lists - Grants read or write access to cluster data via the management interface.
Cloud Authentication - Provides secure access to cloud resources via management keys in API headers.
Embedding Model Adaptations - Adapts standard dense embedding models for late interaction to improve retrieval performance.
Boolean Query Languages - Refines search criteria by creating complex expressions using nested logical operators like AND, OR, and NOT.
Data Partitioning - Isolates data by user or group using metadata to ensure tenant privacy.
Data Snapshotting - Backs up data by creating and downloading collection snapshots from individual cluster nodes.
Data Upsert Operations - Inserts new records or updates existing ones using unique identifiers to ensure data remains current.
Geospatial Databases - Enables filtering and querying of location-based information by combining coordinate constraints with vector similarity.
Multi-Vector Retrieval Systems - Ranks and retrieves highly relevant documents by combining dense semantic embeddings with fine-grained late-interaction embeddings.
Multitenancy Isolation - Partitions records to ensure strict resource separation and data privacy.
Storage Engines - Optimizes storage performance by converting mutable data segments into immutable structures for hardware-level efficiency.
Cloud Infrastructure Automation - Automates cloud platform resources, including account management, cluster provisioning, and backup scheduling.
Binary Communication Protocols - Uses high-performance binary serialization for internal and external data exchange.
gRPC Interfaces - Maximizes application throughput and minimizes latency using a high-performance binary protocol.
Search Recall Tuning - Balances retrieval recall against query latency by tuning search-time parameters like graph exploration depth.
Credential Management - Generates database API keys with configurable expiration and granular permissions.
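
The hybrid search entries in the list above describe combining keyword-based and semantic results. One common way to merge two ranked lists is reciprocal rank fusion (RRF), sketched below. The page does not say which fusion method is applied, so treat this as a generic illustration; the document IDs and the constant `k=60` are assumptions for the example.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked ID lists; each list adds 1 / (k + rank) to an ID's score."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

dense_hits = ["d3", "d1", "d2"]   # hypothetical semantic (dense vector) results
sparse_hits = ["d1", "d4", "d3"]  # hypothetical keyword (sparse vector) results

fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
print(fused)
```

RRF uses only rank positions, so it does not require dense and sparse scores to be on comparable scales.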

Name: qdrant/qdrant
Author: qdrant
Stars: 32,372
Forks: 2,384
Language: Rust
License: Apache-2.0
Views: 14
Website: qdrant.tech

Open-source alternatives to Qdrant

Similar open-source projects, ranked by how many features they share with Qdrant.

lancedb/lancedb (9,031 stars)
LanceDB is a vector database and columnar data store designed to function as a versioned dataset manager and vector search engine. It serves as a high-performance backend for indexing and retrieving high-dimensional embeddings, providing the foundation for machine learning data pipelines. The system distinguishes itself through a combination of cloud-native object storage and immutable version tracking, allowing for data time-travel and reproducible AI experiments. It integrates hybrid search capabilities, merging dense vector similarity with BM25 full-text search and SQL-like scalar filters…
Language: HTML. Tags: approximate-nearest-neighbor-search, image-search, nearest-neighbor-search

alibaba/zvec (5,198 stars)
zvec is an embedded vector database engine and indexing library designed for high-dimensional similarity search. It functions as a hybrid search engine and a retrieval-augmented generation knowledge base, allowing for the storage and retrieval of dense and sparse vectors. The system is distinguished by its hybrid retrieval pipeline, which fuses vector similarity, full-text keyword matching, and scalar metadata filtering into single query operations. It supports a plugin-based model integration system for registering custom embedding models and rerankers, as well as language bindings for nativ…
Language: C++. Tags: ann-search, embedded-database, rag

chroma-core/chroma (26,198 stars)
Chroma is a specialized vector database designed to index and retrieve high-dimensional data representations for semantic similarity search. It functions as a comprehensive platform for information retrieval, enabling the storage and management of unstructured documents alongside structured metadata. By mapping data into numerical representations, the system facilitates rapid similarity lookups across large datasets. The platform distinguishes itself through a hybrid search infrastructure that combines dense vector embeddings with sparse keyword and regular expression matching to balance sema…
Language: Rust. Tags: ai, database, document-retrieval

milvus-io/milvus (44,804 stars)
Milvus is a specialized vector database engine designed for the indexing, management, and high-speed similarity retrieval of high-dimensional vector embeddings. It functions as a similarity search engine capable of identifying nearest neighbors within large-scale vector spaces, supporting the storage and retrieval of billions of data points while maintaining consistent performance. The system utilizes a distributed architecture that decouples storage, query, and coordination into independent services, allowing for horizontal scaling across clusters. It employs a global indexing mechanism that…
Language: Go. Tags: anns, cloud-native, diskann

See all 30 alternatives to Qdrant

Frequently asked questions

What does qdrant/qdrant do?

qdrant/qdrant is a high-performance vector similarity database that stores, indexes, and searches high-dimensional vectors alongside structured metadata.

What are the main features of qdrant/qdrant?

The main features of qdrant/qdrant are: Vector Databases, Vector Search Engines, Hybrid Search Engines, Vector Indexing, Vector Storage, Vector Search, Distributed Search Engines, Hybrid Search.

What are some open-source alternatives to qdrant/qdrant?

Open-source alternatives to qdrant/qdrant include:
lancedb/lancedb - LanceDB is a vector database and columnar data store designed to function as a versioned dataset manager and vector…
alibaba/zvec - zvec is an embedded vector database engine and indexing library designed for high-dimensional similarity search. It…
chroma-core/chroma - Chroma is a specialized vector database designed to index and retrieve high-dimensional data representations for…
milvus-io/milvus - Milvus is a specialized vector database engine designed for the indexing, management, and high-speed similarity…
semi-technologies/weaviate - Weaviate is a cloud-native vector database and distributed vector store designed to save high-dimensional vectors…
surrealdb/surrealdb - SurrealDB is a multi-model database engine designed to store and query document, graph, relational, and vector data…