# Vector Databases for Machine Embeddings

> Search results for `vector database for storing and searching embeddings` on awesome-repositories.com. 114 total matches; showing the first 50.

Explore on the web: https://awesome-repositories.com/q/vector-database-for-storing-and-searching-embeddings

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [this search on awesome-repositories.com](https://awesome-repositories.com/q/vector-database-for-storing-and-searching-embeddings).**

## Results

- [milvus-io/milvus](https://awesome-repositories.com/repository/milvus-io-milvus.md) (44,804 ⭐) — Milvus is a specialized vector database engine designed for the indexing, management, and high-speed similarity retrieval of high-dimensional vector embeddings. It functions as a similarity search engine capable of identifying nearest neighbors within large-scale vector spaces, supporting the storage and retrieval of billions of data points while maintaining consistent performance.

The system utilizes a distributed architecture that decouples storage, query, and coordination into independent services, allowing for horizontal scaling across clusters. It employs a global indexing mechanism that
- [cve-search/cve-search](https://awesome-repositories.com/repository/cve-search-cve-search.md) (2,593 ⭐) — cve-search is a vulnerability search engine and database manager designed to index, synchronize, and query CVE and CPE security vulnerability data. It functions as a security data warehouse that imports vulnerability feeds into a local database to enable fast, keyword-based discovery of security flaws.

The project provides a web-based vulnerability browser and a programmatic JSON API for retrieving records and risk scores. It utilizes full-text indexing for vulnerability descriptions and implements an identity-verified security portal using the OpenID Connect standard for user authentication.
- [flowiseai/flowise](https://awesome-repositories.com/repository/flowiseai-flowise.md) (53,641 ⭐) — Flowise is a low-code platform designed for building and deploying complex language model workflows through a visual, node-based interface. It functions as an orchestrator for autonomous multi-agent systems, allowing users to construct conversational pipelines by connecting language models, memory stores, and external tools on a drag-and-drop canvas.

The platform distinguishes itself through its support for sophisticated agentic patterns, including supervisor-worker delegation and iterative reasoning strategies. Users can design directed acyclic graphs to manage conditional branching, state p
- [pingcap/tidb](https://awesome-repositories.com/repository/pingcap-tidb.md) (40,166 ⭐) — TiDB is a horizontally scalable, distributed SQL database designed to provide consistent transactional storage and high-performance analytical processing within a single unified architecture. It utilizes a decoupled compute-storage design and a distributed key-value storage layer to ensure horizontal scalability and efficient range-based queries. By employing a consensus-based replication algorithm, the system maintains high availability and automatic failover across multiple nodes and geographical regions.

The platform distinguishes itself through its hybrid transactional and analytical proc
- [qdrant/qdrant](https://awesome-repositories.com/repository/qdrant-qdrant.md) (32,372 ⭐) — Qdrant is a high-performance vector similarity database designed to store, index, and search high-dimensional vectors alongside structured metadata. It functions as a distributed search engine that manages large-scale data clusters, providing low-latency retrieval and complex filtering capabilities. The system is built to serve as a specialized middleware layer, connecting machine learning pipelines and AI agents to persistent storage for intelligent information retrieval and recommendation tasks.

The platform distinguishes itself through advanced retrieval techniques, including support for h
- [oramasearch/orama](https://awesome-repositories.com/repository/oramasearch-orama.md) (10,436 ⭐) — Orama is a search engine and vector database that provides full-text indexing, geospatial calculations, and semantic vector storage. It functions as an LLM retrieval engine designed to provide grounded context to language models for conversational interfaces.

The project implements hybrid search by combining dense vector embeddings with inverted keyword indices to retrieve documents based on both semantic meaning and exact text matches. It utilizes a WebAssembly module to execute search logic across different JavaScript environments and platforms.

The system covers a broad range of retrieval
- [nomic-ai/gpt4all](https://awesome-repositories.com/repository/nomic-ai-gpt4all.md) (77,375 ⭐) — GPT4All is a cross-platform runtime environment designed to execute large language models directly on local consumer hardware. By leveraging an optimized C++ inference backend, it enables private, offline AI interactions without requiring an internet connection or external cloud services. The project provides a comprehensive ecosystem for managing the entire model lifecycle, including discovery, downloading, and configuration of local weights.

What distinguishes the platform is its integrated retrieval-augmented generation engine, which allows users to index local documents into semantic vect
- [embedding/chinese-word-vectors](https://awesome-repositories.com/repository/embedding-chinese-word-vectors.md) (12,227 ⭐) — This project is a collection of pre-trained dense and sparse word vectors trained on diverse Chinese corpora. It serves as a library of linguistic representations and an NLP vector dataset designed to improve the accuracy of semantic and morphological analysis in text models.

The collection provides corpus-specific representations and utilizes n-gram co-occurrence modeling to capture diverse linguistic patterns. It includes a hybrid of dense-sparse vectors to balance computational efficiency and semantic precision.

The project covers semantic vector search and the development of Chinese natu
- [mastra-ai/mastra](https://awesome-repositories.com/repository/mastra-ai-mastra.md) (21,221 ⭐) — Mastra is an orchestration framework designed for building, deploying, and managing autonomous AI agents and multi-agent systems. It provides a comprehensive suite of primitives for creating resilient AI applications, including durable workflow orchestration, event-driven agent loops, and semantic memory management. By integrating these core components, the platform enables developers to build complex, multi-step processes that can reason about goals and execute tasks without manual intervention.

The framework distinguishes itself through its focus on observability and secure, isolated execut
- [oceanbase/oceanbase](https://awesome-repositories.com/repository/oceanbase-oceanbase.md) (9,980 ⭐) — OceanBase is a distributed SQL database designed for high availability and strong consistency across multiple nodes and regions. It functions as a hybrid transactional and analytical processing engine, allowing real-time analytics and transactions to execute on a single data copy. The system also serves as a vector database engine for indexing and querying vector data to power semantic search and recommendation systems.

The platform features native compatibility layers for MySQL and Oracle, enabling the migration of legacy workloads without rewriting SQL code. It utilizes a Paxos-based distri
- [patricktrainer/duckdb-embedding-search](https://awesome-repositories.com/repository/patricktrainer-duckdb-embedding-search.md) (150 ⭐) — This repository contains a Python application that utilizes DuckDB as a backend to store and retrieve embedding vectors. The novel use of DuckDB allows for efficient similarity searches among large datasets. In this example, we've loaded comments from Hacker News and implemented functionality to…
- [rohitg00/agentmemory](https://awesome-repositories.com/repository/rohitg00-agentmemory.md) (23,785 ⭐) — AgentMemory is a persistent knowledge store and memory server designed to provide AI coding agents with long-term memory. It functions as a knowledge graph engine and vector database store that saves and recalls project context, architectural decisions, and patterns across different sessions.

The system distinguishes itself by using a tiered-memory consolidation pipeline that compresses raw observations into episodic, semantic, and procedural layers to optimize token usage. It employs a hybrid retrieval strategy combining keyword matching, vector embeddings, and graph traversal to surface rel
- [openai/openai-cookbook](https://awesome-repositories.com/repository/openai-openai-cookbook.md) (74,196 ⭐) — This project is a technical learning resource and developer knowledge base focused on the integration of large language models into software applications. It provides a structured collection of guides and code examples designed to teach developers how to implement intelligent features using proven patterns and best practices.

The repository distinguishes itself through a library of functional demonstrations that cover complex topics such as retrieval-augmented generation, function calling, and prompt engineering workflows. These materials are organized into a modular structure, allowing for t
- [mahyarmirrashed/search-and-replace.nvim](https://awesome-repositories.com/repository/mahyarmirrashed-search-and-replace-nvim.md) (7 ⭐) — Search and replace functionality in Neovim.
- [dizitart/nitrite-database](https://awesome-repositories.com/repository/dizitart-nitrite-database.md) (904 ⭐) — NoSQL embedded document store for Java
- [open-webui/open-webui](https://awesome-repositories.com/repository/open-webui-open-webui.md) (142,694 ⭐) — Open WebUI is a self-hosted, web-based platform designed for interacting with local and remote artificial intelligence models. It functions as a unified interface and orchestration suite, enabling users to build, deploy, and manage specialized AI agents equipped with custom instructions, external tool access, and private knowledge bases.

The platform distinguishes itself through a modular architecture that supports complex AI workflows. It features a plugin-based framework for custom logic and pipeline-based request processing, allowing developers to filter or transform data streams before th
- [microsoft/generative-ai-for-beginners](https://awesome-repositories.com/repository/microsoft-generative-ai-for-beginners.md) (112,045 ⭐) — This project is a comprehensive, open-source educational curriculum designed to guide developers through the mastery of generative artificial intelligence. It provides a structured learning path that covers foundational concepts, prompt engineering, and the practical application of large language models. The repository serves as a central hub for skill acquisition, offering sequential modules that progress from basic model mechanics to advanced architectural patterns.

The curriculum distinguishes itself by focusing on the end-to-end lifecycle of intelligent software, including the implementat
- [mongodb/mongo](https://awesome-repositories.com/repository/mongodb-mongo.md) (28,158 ⭐) — This project is a distributed, document-oriented database system designed to store information in flexible, hierarchical structures. It supports horizontal scaling through automated sharding and maintains high availability across global clusters using a multi-node replication protocol. By executing multi-document operations as atomic units, the system ensures data integrity and consistency across distributed environments.

The platform distinguishes itself by integrating advanced vector-based indexing, which enables semantic similarity searches alongside traditional geospatial and lexical quer
- [s1n7ax/nvim-search-and-replace](https://awesome-repositories.com/repository/s1n7ax-nvim-search-and-replace.md) (70 ⭐) — Really simple plugin to search and replace multiple files
- [arangodb/arangodb](https://awesome-repositories.com/repository/arangodb-arangodb.md) (14,091 ⭐) — This project is a multi-model database system designed to store and manage information as documents, graphs, and key-value pairs within a single engine. It functions as a graph database and knowledge graph platform, providing the infrastructure to build, query, and visualize structured data models. By integrating vector search capabilities, the system serves as a vector database that supports retrieval-augmented generation for artificial intelligence applications.

The platform distinguishes itself through a unified query language that allows users to perform document lookups, graph traversals
- [typesense/typesense](https://awesome-repositories.com/repository/typesense-typesense.md) (25,254 ⭐) — Typesense is a distributed search engine designed to provide sub-millisecond query latency across massive datasets. It functions as both a high-performance indexing and retrieval engine and a comprehensive search experience platform, offering built-in typo tolerance and tools for managing relevance through synonym configuration, result curation, and complex filtering.

The platform distinguishes itself by utilizing in-memory indexing to maintain high-throughput data retrieval and integrating vector database capabilities to support semantic similarity searches. It ensures data consistency and h
- [neo4j/neo4j](https://awesome-repositories.com/repository/neo4j-neo4j.md) (15,928 ⭐) — Neo4j is a native graph database management system designed to store and query highly connected data using a property-graph model. It provides an ACID-compliant transaction engine that ensures data integrity, supported by a distributed cluster architecture that maintains causal consistency across nodes. Users interact with the system through a declarative query language, which allows for complex pattern matching and path traversal without requiring manual traversal logic.

The platform distinguishes itself through its hybrid approach to data retrieval, combining traditional graph-based queries
- [vectorize-io/vectorize-mcp-server](https://awesome-repositories.com/repository/vectorize-io-vectorize-mcp-server.md) (108 ⭐) — Official Vectorize MCP Server
- [typeorm/typeorm](https://awesome-repositories.com/repository/typeorm-typeorm.md) (36,540 ⭐) — TypeORM is an object-relational mapper for TypeScript and JavaScript that bridges the gap between object-oriented application code and relational database tables. It provides a comprehensive data persistence layer that allows developers to define database entities using class decorators or configuration objects, enabling seamless interaction with data through object-oriented patterns.

The project distinguishes itself through a flexible architecture that supports both the data mapper and repository patterns, alongside a fluent query builder that translates high-level method calls into platform
- [greenrobot/eventbus](https://awesome-repositories.com/repository/greenrobot-eventbus.md) (24,760 ⭐) — EventBus is a publish-subscribe messaging library designed to facilitate decoupled communication between components in Java applications. It functions as a central hub where producers dispatch events that are routed to subscribers based on the class type of the payload. By using annotation-based markers, the system maps event handlers to specific data types, allowing different parts of an application to exchange information without requiring direct references between classes.

The library distinguishes itself through a focus on performance and execution control. It utilizes a compile-time inde
- [rayhollister/database-users-for-yourls](https://awesome-repositories.com/repository/rayhollister-database-users-for-yourls.md) (4 ⭐) — Database Users replaces the static credential array in user/config.php with a database-backed user table and a lightweight administration panel. Activate it to keep logins inside YOURLS, grant a password self-service form, and stay compatible with existing hashing schemes.
- [encode/databases](https://awesome-repositories.com/repository/encode-databases.md) (4,002 ⭐) — Async database support for Python. 🗄
- [dokploy/dokploy](https://awesome-repositories.com/repository/dokploy-dokploy.md) (34,901 ⭐) — Dokploy is a self-hosted platform-as-a-service designed to simplify the deployment and management of containerized applications and databases. It provides a centralized control plane that decouples administrative management from application workloads, allowing users to oversee infrastructure across multiple server nodes through a unified web interface or a command-line tool.

The platform distinguishes itself through an extensive library of pre-configured application templates, enabling the rapid deployment of databases, identity providers, and various productivity or development tools. It sup
- [embedded-graphics/embedded-graphics](https://awesome-repositories.com/repository/embedded-graphics-embedded-graphics.md) (1,295 ⭐) — A no_std graphics library for embedded applications
- [activeloopai/hub](https://awesome-repositories.com/repository/activeloopai-hub.md) (9,177 ⭐) — Hub is a multimodal AI data lake and vector database designed for storing and querying embeddings, text, audio, and images. It functions as a dataset version control system and a machine learning data streaming engine to support large-scale model training.

The system utilizes a serverless PostgreSQL vector store to index high-dimensional embeddings for semantic search. It provides a visual interface for inspecting multimodal datasets and viewing annotations such as bounding boxes and masks.

The platform handles cloud-agnostic storage synchronization and implements lazy, compressed data strea
- [tursodatabase/libsql](https://awesome-repositories.com/repository/tursodatabase-libsql.md) (16,887 ⭐) — LibSQL is a high-performance, distributed SQL database engine that extends SQLite to support remote network access, edge computing, and real-time synchronization. It functions as an embedded database library that integrates directly into application processes while providing the infrastructure to maintain consistency across multiple geographic regions.

The platform distinguishes itself by enabling database interaction over standard HTTP protocols, allowing applications to query remote data sources in serverless and edge environments without requiring local filesystem access. It includes nativ
- [n8n-io/n8n](https://awesome-repositories.com/repository/n8n-io-n8n.md) (192,772 ⭐) — n8n is a workflow automation platform that combines a visual interface with code-based extensibility to design, orchestrate, and manage automated processes. It provides a comprehensive suite of tools for data transformation, filtering, and storage, allowing users to build complex logic through conditional branching, looping, and sub-workflow execution. The platform supports both pre-built integration nodes and custom code execution in JavaScript or Python, enabling connectivity with a wide range of external services and APIs.

The platform includes a suite of generative AI capabilities, such a
- [ivopetiz/crypto-database](https://awesome-repositories.com/repository/ivopetiz-crypto-database.md) (107 ⭐) — Database to store all data from crypto exchanges, currently working with Binance, Bittrex, Cryptopia and Poloniex.
- [thepeoplesbourgeois/d3-vector](https://awesome-repositories.com/repository/thepeoplesbourgeois-d3-vector.md) (4 ⭐) — A D3 force plug-in for programmatically determining a set of vectors' angles and magnitudes
- [pubkey/rxdb](https://awesome-repositories.com/repository/pubkey-rxdb.md) (23,048 ⭐) — This project is a reactive, offline-first NoSQL database engine designed for JavaScript applications. It provides a robust framework for managing application state by synchronizing data across browsers, mobile devices, and server-side runtimes. By treating local storage as the primary source of truth, it enables applications to remain functional without network connectivity, automatically reconciling changes with remote backends once a connection is restored.

The database distinguishes itself through a modular architecture that supports cross-environment synchronization and high-performance d
- [eto-ai/lance](https://awesome-repositories.com/repository/eto-ai-lance.md) (6,671 ⭐) — Lance is a versioned columnar data format and storage engine designed as a multimodal AI lakehouse. It serves as a vector database storage engine and a cloud object store dataset manager, organizing images, video, audio, and embeddings into a unified format optimized for machine learning workflows.

The project distinguishes itself by combining a columnar layout for structured data with a specialized blob store for large multimodal tensors. It implements a hybrid search engine that integrates vector similarity search, full-text search, and SQL analytics on a single dataset, supported by a stor
- [alibaba/zvec](https://awesome-repositories.com/repository/alibaba-zvec.md) (5,198 ⭐) — zvec is an embedded vector database engine and indexing library designed for high-dimensional similarity search. It functions as a hybrid search engine and a retrieval-augmented generation knowledge base, allowing for the storage and retrieval of dense and sparse vectors.

The system is distinguished by its hybrid retrieval pipeline, which fuses vector similarity, full-text keyword matching, and scalar metadata filtering into single query operations. It supports a plugin-based model integration system for registering custom embedding models and rerankers, as well as language bindings for nativ
- [n8n-io/self-hosted-ai-starter-kit](https://awesome-repositories.com/repository/n8n-io-self-hosted-ai-starter-kit.md) (14,997 ⭐) — This project provides a dockerized AI workflow stack and orchestration templates for deploying a self-hosted AI environment. It establishes a localized infrastructure for building autonomous agents and model chains that process private data on-premises without external cloud dependencies.

The environment is designed to support autonomous agent development, allowing models to dynamically select tools, execute shell commands, and interact with local file systems. It includes integrated vector database support to enable retrieval-augmented generation and private document analysis.

The stack cov
- [vectordotdev/vector](https://awesome-repositories.com/repository/vectordotdev-vector.md) (22,071 ⭐) — Vector is a high-performance observability data pipeline designed to collect, transform, and route logs, metrics, and traces across distributed infrastructure. It functions as a modular engine that decouples data ingestion from processing and transmission, utilizing a component-based architecture to connect diverse sources to multiple destinations.

The project distinguishes itself through a focus on reliability and flow control. It implements backpressure-aware data movement to prevent data loss during traffic spikes and utilizes disk-backed event buffering to ensure durability during network
- [automattf/vector.lua](https://awesome-repositories.com/repository/automattf-vector-lua.md) (63 ⭐) — A simple vector library for Lua based on the PVector class from Processing
- [anthropics/claude-cookbooks](https://awesome-repositories.com/repository/anthropics-claude-cookbooks.md) (45,835 ⭐) — This repository serves as a comprehensive library of architectural blueprints and code examples for integrating large language models into software applications. It functions as a developer learning resource, providing structured tutorials and implementation patterns that demonstrate how to build intelligent features using advanced prompting and data processing techniques.

The collection distinguishes itself by focusing on complex reasoning and data-grounding workflows. It provides practical guidance on implementing retrieval-augmented generation pipelines, which connect language models to pr
- [redis/go-redis](https://awesome-repositories.com/repository/redis-go-redis.md) (22,159 ⭐) — This project is a feature-rich Go client library designed for interacting with Redis. It serves as a comprehensive interface for managing remote data stores, enabling developers to execute standard database commands, handle complex data structures, and perform asynchronous operations within Go applications.

The library distinguishes itself through its support for advanced Redis capabilities, including connection pooling, pipelining, and transactional integrity. It provides specialized primitives for managing distributed clusters, including automated topology updates and request routing to sha
- [meilisearch/meilisearch](https://awesome-repositories.com/repository/meilisearch-meilisearch.md) (58,118 ⭐) — Meilisearch is a Rust-based search engine providing typo-tolerant full-text and vector-based semantic search with real-time conversational capabilities.
- [drizzle-team/drizzle-orm](https://awesome-repositories.com/repository/drizzle-team-drizzle-orm.md) (34,835 ⭐) — Drizzle ORM is a TypeScript-native database toolkit providing type-safe SQL query building, schema management, and automated migrations across PostgreSQL, MySQL, SQLite, and SingleStore.
- [rust-embedded/awesome-embedded-rust](https://awesome-repositories.com/repository/rust-embedded-awesome-embedded-rust.md) (7,927 ⭐) — Curated list of resources for Embedded and Low-level development in the Rust programming language
- [geldata/gel](https://awesome-repositories.com/repository/geldata-gel.md) (14,065 ⭐) — Gel is an object-relational database system that models data as a graph of interconnected objects. By utilizing a strongly typed schema, it enables complex relational queries and polymorphic data structures without the need for traditional join tables. The system integrates native vector storage and similarity search operators, allowing it to function as both a relational and a vector database for semantic data retrieval.

The platform distinguishes itself through a comprehensive suite of developer-centric automation tools. It features a declarative migration system that tracks and versions sc
- [avelino/awesome-go](https://awesome-repositories.com/repository/avelino-awesome-go.md) (175,576 ⭐) — This project serves as a comprehensive language ecosystem index, functioning as a centralized, community-curated directory for the Go programming language. It organizes a vast landscape of software components, libraries, and development tools into a structured, navigable hierarchy, enabling developers to efficiently discover resources tailored to specific functional domains.

The repository distinguishes itself through a decentralized contribution model, where community-driven updates ensure the index remains current with the rapidly evolving software landscape. Beyond simple resource listing,
- [alexdrone/store](https://awesome-repositories.com/repository/alexdrone-store.md) (501 ⭐) — Unidirectional, transactional, operation-based Store implementation.
- [get-convex/convex-backend](https://awesome-repositories.com/repository/get-convex-convex-backend.md) (11,947 ⭐) — Convex is a serverless backend platform that provides a real-time reactive database, serverless functions, and state synchronization for web applications. It manages relational JSON documents using ACID-compliant transactions and schema validation to ensure data consistency and integrity.

The platform distinguishes itself by synchronizing database state with clients via WebSockets, allowing user interfaces to update automatically as data changes. It also includes a specialized vector search database for performing semantic search using embeddings and supports both cloud-native deployment and
- [illuminate/database](https://awesome-repositories.com/repository/illuminate-database.md) (2,766 ⭐) — [READ ONLY] Subtree split of the Illuminate Database component (see laravel/framework)
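
Many of the entries above describe the same core operation: storing embedding vectors and retrieving the nearest neighbors of a query vector. As a rough illustration of that idea only, here is a minimal sketch of a brute-force, in-memory store that ranks items by cosine similarity. The names `TinyVectorStore`, `add`, and `search` are hypothetical and do not correspond to the API of any repository listed here; the projects above typically rely on approximate indexes rather than a linear scan.

```python
import math
from typing import Dict, List, Optional, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


class TinyVectorStore:
    """Hypothetical in-memory store: exact search by scanning every item."""

    def __init__(self) -> None:
        self._items: Dict[str, Tuple[List[float], dict]] = {}

    def add(self, item_id: str, vector: List[float],
            metadata: Optional[dict] = None) -> None:
        # Store a copy of the vector alongside optional structured metadata.
        self._items[item_id] = (list(vector), dict(metadata or {}))

    def search(self, query: List[float], k: int = 3) -> List[Tuple[str, float]]:
        # Score every stored vector, then keep the k most similar.
        scored = [
            (item_id, cosine_similarity(query, vector))
            for item_id, (vector, _metadata) in self._items.items()
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:k]


store = TinyVectorStore()
store.add("doc-a", [1.0, 0.0, 0.0], {"source": "example"})
store.add("doc-b", [0.9, 0.1, 0.0])
store.add("doc-c", [0.0, 1.0, 0.0])

top = store.search([1.0, 0.0, 0.0], k=2)
print(top)
```

A linear scan like this costs one similarity computation per stored vector for every query, which is why dedicated engines build indexes to avoid comparing against the whole collection.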