30 open-source projects similar to embedchain/embedchain, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best Embedchain alternative.
Memori is an AI agent memory middleware platform designed to provide persistent, context-aware recall for language models. It functions as a non-intrusive layer that intercepts outbound model requests to automatically capture interaction history and execution traces, ensuring that agents maintain continuity across sessions without requiring modifications to existing application logic. The platform distinguishes itself through a dual-model storage architecture that maintains information as both structured relational primitives for precise fact retrieval and rolling narrative summaries for situ
This project is a retrieval-augmented generation pipeline designed for building custom ChatGPT plugins that allow language models to query private or professional documents. It implements a full retrieval workflow, from processing and indexing document chunks to retrieving relevant context for natural language queries. The system distinguishes itself through a hybrid retrieval approach that combines dense vector embeddings with sparse keyword matching, further refined by a two-stage semantic re-ranking process. It includes specialized data privacy tools for screening personally identifiable i
LangChain is an orchestration framework designed for building, managing, and deploying applications powered by large language models. It provides a unified integration layer that normalizes disparate model provider APIs into a consistent set of primitives, enabling developers to build complex, multi-step AI workflows that manage state, memory, and tool execution. The project distinguishes itself through a durable execution runtime that maintains persistent state across long-running processes by checkpointing progress to external storage. It models agent workflows as directed graphs, allowing
LangChain is a framework for building applications that chain large language models with external data sources and third-party tools. It serves as an orchestrator for autonomous agents that use language models to plan and execute multi-step tasks, while providing a toolkit for linking interoperable AI components into sequences to prototype complex model behaviors. The project provides a model-agnostic integration layer, allowing users to switch between different language model providers using a standardized interface. It also includes tools for observability and evaluation to track the perfor
Graphiti is a backend framework and memory server designed to provide artificial intelligence agents with persistent, time-aware knowledge graph storage. It functions as a memory layer that enables agents to maintain context across long-term interactions by recording and evolving structured data over time. The system distinguishes itself through a specialized temporal graph database that tracks how entities and relationships change using validity windows. By combining semantic vector similarity, keyword matching, and graph topology traversal, the engine performs hybrid retrieval to locate rel
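The validity-window idea mentioned in the Graphiti entry can be illustrated with a minimal sketch. This is not Graphiti's API: the `Edge` and `TemporalGraph` names are hypothetical, and the sketch assumes each subject/relation pair holds one value at a time, so asserting a new fact closes the window of the previous one.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Edge:
    """A relationship that is only true inside [valid_from, valid_to)."""
    subject: str
    relation: str
    obj: str
    valid_from: datetime
    valid_to: Optional[datetime] = None  # None means "still valid"


class TemporalGraph:
    """Hypothetical in-memory store; assumes single-valued relations."""

    def __init__(self) -> None:
        self.edges: List[Edge] = []

    def assert_fact(self, subject: str, relation: str, obj: str, at: datetime) -> None:
        # Close the window of any currently open edge for the same subject/relation.
        for edge in self.edges:
            if edge.subject == subject and edge.relation == relation and edge.valid_to is None:
                edge.valid_to = at
        self.edges.append(Edge(subject, relation, obj, valid_from=at))

    def query(self, subject: str, relation: str, at: datetime) -> List[str]:
        # Return the objects whose validity window contains the timestamp `at`.
        return [
            e.obj
            for e in self.edges
            if e.subject == subject
            and e.relation == relation
            and e.valid_from <= at
            and (e.valid_to is None or at < e.valid_to)
        ]
```

Old facts are never deleted in this sketch; they are only given an end time, which is what lets a query ask what was true at an earlier point.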
MemGPT is a memory management framework and external memory layer for large language models. It functions as a platform for building stateful AI agents that maintain a persistent identity and continuous context across multiple sessions. The system enables agents to bypass fixed context window limitations by using a virtual context windowing approach. This allows models to manage their own memory through internal commands to search, update, and delete stored information within a hierarchical structure of short-term working context and long-term archival storage. The framework provides a local
LanceDB is a vector database and columnar data store designed to function as a versioned dataset manager and vector search engine. It serves as a high-performance backend for indexing and retrieving high-dimensional embeddings, providing the foundation for machine learning data pipelines. The system distinguishes itself through a combination of cloud-native object storage and immutable version tracking, allowing for data time-travel and reproducible AI experiments. It integrates hybrid search capabilities, merging dense vector similarity with BM25 full-text search and SQL-like scalar filters
Owl is a framework for agentic workflow automation and multi-agent orchestration. It functions as a system for coordinating autonomous large language model agents to decompose and execute complex tasks through shared communication and collaborative planning. The project distinguishes itself through a multi-modal toolset for processing images, audio, and video, alongside a synthetic data generator that produces domain-specific datasets using self-instruct and verifier loops. It further incorporates a retrieval-augmented generation pipeline framework that integrates long-term memory and real-ti
Hermes-agent is an autonomous AI agent framework and runtime designed to execute complex tasks and synthesize new skills from execution traces. It includes a provider-agnostic gateway for routing requests across multiple model backends and a serverless runtime that suspends idle agent instances and resumes them on demand across containers and virtual machines. The project provides a desktop automation toolset that controls native GUI workflows on Linux by querying accessibility APIs and injecting input events. It further distinguishes itself with the ability to generate procedural skills from
Letta is a framework for building, deploying, and managing autonomous AI agents that maintain persistent state across long-term interactions. It provides a comprehensive suite of primitives for defining agents with configurable personas, modular memory blocks, and tool-use capabilities, enabling them to retain user preferences and conversation history over extended sessions. The platform distinguishes itself through its advanced memory management and orchestration capabilities. It allows agents to autonomously update their own memory, perform retrieval-augmented generation, and coordinate com
AIOS is an LLM agent operating system and orchestration kernel designed to manage memory, resource scheduling, and tool execution for multiple autonomous AI agents. It serves as a comprehensive framework for developing and deploying agents, featuring a dedicated resource manager that coordinates model backends, GPU memory, and isolated kernel instances. The system distinguishes itself through a semantic memory engine that uses vector search and autonomous clustering for long-term knowledge management, and a semantic file system that allows users to control computer files and system operations
Mastra is an orchestration framework designed for building, deploying, and managing autonomous AI agents and multi-agent systems. It provides a comprehensive suite of primitives for creating resilient AI applications, including durable workflow orchestration, event-driven agent loops, and semantic memory management. By integrating these core components, the platform enables developers to build complex, multi-step processes that can reason about goals and execute tasks without manual intervention. The framework distinguishes itself through its focus on observability and secure, isolated execut
This project is a comprehensive framework for building and managing autonomous agent systems. It provides a unified architecture for orchestrating multi-agent societies, where specialized agents collaborate through roleplay to decompose and solve complex tasks. The system integrates language models with external environments, enabling agents to perform real-world actions through a standardized tool-calling abstraction layer. The framework distinguishes itself through its focus on iterative reasoning and data reliability. It employs automated feedback loops to refine agent outputs and self-eva
SimpleMem is a persistent memory system for AI assistants designed to maintain context across different user chat sessions. It functions as a memory server and multimodal vector database that stores and retrieves information from text, images, audio, and video. The project features a context compression engine that distills interaction histories into compact units to reduce token consumption. It utilizes a distributed memory orchestrator and worker-thread parallel processing to reduce latency when querying large-scale dialogue datasets. The system implements a hybrid indexing approach combin
memU is a long-term memory system for AI agents that provides a persistent knowledge base. It extracts facts and preferences from conversations into structured memories, organizing this information through a hierarchical knowledge base based on a file-system architecture of nested categories and linked resources. The system includes a multimodal data ingestion pipeline that converts audio, video, and images into standardized natural language for storage in large language model contexts. It also features a model provider abstraction layer, offering a unified interface to use interchangeable la
MemOS is an open-source persistent memory layer for AI agents and large language models, providing a self-hosted server that stores and retrieves structured memory across sessions. It enables AI systems to recall user preferences, history, and context without retraining, using a graph-based API and a web management interface for viewing, editing, and organizing memory items, skills, traces, and knowledge bases. The system distinguishes itself through a portable memory interchange protocol that allows memory to be transferred between different AI models, devices, and applications, along with a
MemMachine is a centralized memory management server and model-agnostic memory layer for large language models. It functions as a persistence layer that stores user profiles and conversational context, providing a decoupled data store that prevents vendor lock-in by serving different AI models through a consistent API. The system implements the Model Context Protocol to share persistent agent memories and session data with compatible AI clients. It utilizes a multi-tiered memory hierarchy, combining a graph-based conversation store for episodic interactions with a vector knowledge base for se
AgentMemory is a persistent knowledge store and memory server designed to provide AI coding agents with long-term memory. It functions as a knowledge graph engine and vector database store that saves and recalls project context, architectural decisions, and patterns across different sessions. The system distinguishes itself by using a tiered-memory consolidation pipeline that compresses raw observations into episodic, semantic, and procedural layers to optimize token usage. It employs a hybrid retrieval strategy combining keyword matching, vector embeddings, and graph traversal to surface rel
RedisInsight is a graphical user interface and management tool for browsing, analyzing, and administering Redis databases. It provides a visual environment for exploring key-value data structures, managing database instances, and performing data analysis across different operating systems and deployments. The tool distinguishes itself by providing dedicated visual managers for complex operations, including a vector database manager for configuring embeddings and similarity searches, a query workbench for executing raw commands and Lua scripts, and a performance monitoring dashboard for tracki
zvec is an embedded vector database engine and indexing library designed for high-dimensional similarity search. It functions as a hybrid search engine and a retrieval-augmented generation knowledge base, allowing for the storage and retrieval of dense and sparse vectors. The system is distinguished by its hybrid retrieval pipeline, which fuses vector similarity, full-text keyword matching, and scalar metadata filtering into single query operations. It supports a plugin-based model integration system for registering custom embedding models and rerankers, as well as language bindings for nativ
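Several entries in this list describe hybrid retrieval that fuses vector similarity, keyword matching, and metadata filtering. The sketch below shows one common way to do that, reciprocal rank fusion (RRF). It is an illustration only: the listing does not say which fusion method zvec or any other project uses, and the function names, the `k=60` constant, and the input shapes are assumptions.

```python
from typing import Callable, Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Merge ranked lists of document ids; higher fused score ranks first."""
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every document it contains.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by descending score; break ties by id so the order is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


def hybrid_search(
    vector_ranking: List[str],
    keyword_ranking: List[str],
    metadata: Dict[str, dict],
    predicate: Callable[[dict], bool],
    k: int = 60,
) -> List[str]:
    """Apply a scalar metadata filter, then fuse the two rankings with RRF."""
    def allowed(doc_id: str) -> bool:
        return predicate(metadata.get(doc_id, {}))

    filtered = [
        [d for d in vector_ranking if allowed(d)],
        [d for d in keyword_ranking if allowed(d)],
    ]
    return reciprocal_rank_fusion(filtered, k)
```

RRF only needs ranks, not raw scores, so it avoids having to normalize cosine similarities against keyword scores such as BM25.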
Orama is a search engine and vector database that provides full-text indexing, geospatial calculations, and semantic vector storage. It functions as an LLM retrieval engine designed to provide grounded context to language models for conversational interfaces. The project implements hybrid search by combining dense vector embeddings with inverted keyword indices to retrieve documents based on both semantic meaning and exact text matches. It utilizes a WebAssembly module to execute search logic across different JavaScript environments and platforms. The system covers a broad range of retrieval
Mempalace is a long-term memory management system for large language models that orchestrates the storage and retrieval of conversation history and entity relationships. It functions as a memory orchestrator and Model Context Protocol server, providing AI clients with read and write access to structured knowledge. The system utilizes a temporal knowledge graph to track evolving entity relationships and timelines with validity windows. It employs a hierarchical memory partitioning strategy, organizing data into wings and rooms to isolate specialist agent contexts and restrict semantic searches
Cognee is an agentic memory management platform designed to provide autonomous agents with long-term semantic recall and structured knowledge. It functions as a framework for building persistent memory systems that connect large language models to graph-based knowledge and vector storage, enabling agents to maintain context across complex tasks and multiple sessions. The platform distinguishes itself through a hybrid approach that combines semantic similarity search with structural graph traversal, allowing for context-aware information retrieval. It features a modular architecture that orche
llmware is a Python framework for AI agent orchestration and model management, designed to coordinate multi-model workflows and autonomous agents. It provides a unified model catalog and standardized interface to execute specialized language models for complex research, analysis, and structured data generation. The project distinguishes itself through its heavy emphasis on local execution and quantized inference, allowing models to run on private infrastructure using CPU, GPU, and NPU acceleration via runtimes like ONNX and OpenVINO. It features a specialized ability to translate natural lang
QAnything is a retrieval-augmented generation application framework and self-hosted AI interface. It functions as a system that combines a vector database knowledge base, a document parsing service, and a hybrid search engine to generate answers based on private user data. The project features a modular pipeline architecture that allows users to independently replace components such as parsers, embedding models, and reranking engines. It supports local-first model deployment and offline operation to ensure data privacy, and includes a two-stage retrieval pipeline that merges dense vector embe
Claude-context is a retrieval-augmented generation pipeline and semantic code search tool. It functions as an LLM codebase indexer and RAG context provider, designed to index local directories and retrieve relevant code files to provide context for large language models. The system operates as a hybrid search engine that combines keyword matching with dense vector search. This allows for the retrieval of code snippets and logic using natural language queries based on meaning rather than exact text matches. The project covers codebase indexing and search index management, utilizing asynchrono
R2R is an agentic retrieval-augmented generation platform that uses reasoning agents to perform multi-step data fetching for context-aware answering. It functions as a multimodal vector database manager and knowledge graph engine designed to ground artificial intelligence responses in verified factual knowledge. The platform distinguishes itself by combining reasoning agents for complex research automation with a knowledge graph that maps entity relationships. This allows the system to perform structured data traversal alongside unstructured vector search to resolve complex questions from int
Paper-qa is a retrieval-augmented generation system designed for question answering and analysis of scientific literature and technical documents. It functions as an LLM-powered research assistant that extracts grounded answers and summaries with citations from a document library. The system utilizes an agentic RAG orchestrator to iteratively refine search queries and gather evidence through multi-step tool calling. It features a multimodal document parser that extracts text, tables, and images from PDFs, alongside a vector-based indexer that embeds and caches document libraries for efficient
FinGPT is a suite of specialized financial tools and a framework for adapting large language models to the financial domain. It provides a set of pipelines for financial entity extraction, sentiment analysis, and retrieval-augmented generation to improve the accuracy of financial information systems. The project distinguishes itself through efficient training workflows, utilizing low-rank adaptation and quantized low-rank adaptation to fine-tune models on consumer-grade hardware. It employs market-labeled datasets and reinforcement learning that uses actual stock price movements as reward sig
qmd is a local semantic search engine and RAG knowledge base indexer that functions as a Model Context Protocol server. It converts local documents, markdown files, and codebases into a searchable database to provide retrieval-augmented generation capabilities for AI agents. The system exposes its search and retrieval tools via stdio or HTTP. It utilizes local model files for embeddings and reranking, supporting query expansion across multiple languages. The project employs chunking based on abstract syntax trees to split source code at function and class boundaries. It implements hybrid vector-
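Splitting source code at function and class boundaries, as the qmd entry describes, can be sketched with Python's standard `ast` module. This is a stand-in for illustration and not qmd's implementation: the `chunk_source` function and its output fields are hypothetical, and the sketch handles only top-level definitions in Python source.

```python
import ast
from typing import Dict, List


def chunk_source(source: str) -> List[Dict[str, object]]:
    """Return one chunk per top-level function or class in Python source."""
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks: List[Dict[str, object]] = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # Decorators sit above node.lineno, so start the chunk at the earliest one.
            start = min([node.lineno] + [d.lineno for d in node.decorator_list])
            end = node.end_lineno  # available since Python 3.8
            chunks.append(
                {
                    "name": node.name,
                    "kind": "class" if isinstance(node, ast.ClassDef) else "function",
                    "start_line": start,
                    "end_line": end,
                    "text": "\n".join(lines[start - 1 : end]),
                }
            )
    return chunks
```

Compared with fixed-size windows, chunks cut this way never split a definition in half, so each embedded unit is a complete function or class.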