We curate 27 open-source GitHub repositories matching "ai agent memory and persistent context". Results are ranked by relevance to your query.
LangChain4j is a framework and library for building applications powered by large language models on the JVM. It provides a unified API for developing AI agents, implementing retrieval augmented generation, and integrating generative AI capabilities into professional software built with frameworks like Spring Boot or Quarkus. The project enables the creation of autonomous agents that can reason through tasks, manage memory, and execute external tools to achieve specific goals. It differentiates itself through a unified model interface that allows developers to switch between multiple model pr
LangChain4j is a JVM framework for building LLM-powered applications with built-in agent memory management, vector database integrations (Pinecone, Milvus, Chroma, pgvector), conversation history persistence, and RAG support, making it a comprehensive tool for retaining context across AI agent sessions.
This project is a Java-based framework integration that provides an AI agent runtime, a graph-based AI workflow engine, and an LLM orchestration framework for Spring applications. It enables the development of stateful autonomous agents and the implementation of retrieval-augmented generation systems using document processing and vector databases. The framework distinguishes itself through a graph-based workflow runtime for designing complex AI pipelines with conditional routing and persistent state. It supports multi-agent orchestration via service-discovery coordination and provides human-i
Spring AI Alibaba is a Java framework that provides a stateful AI agent runtime with graph-based workflows, vector database integration, and retrieval-augmented generation, directly addressing persistent context and memory across sessions for AI agents.
Zep is a long-term memory layer and persistent storage system for large language model applications. It functions as a memory service and vector database orchestrator that manages chat history, user preferences, and context retrieval to reduce hallucinations in AI agents. The system maintains a temporal knowledge graph that stores interaction data as dated facts to track how user preferences and environments evolve over time. It combines these knowledge graphs with a store for persisting unstructured message data at the user and session levels. The platform provides capabilities for AI conte
Zep is purpose-built as a long-term memory layer and persistent storage system for LLM-powered AI agents, managing chat history, context retrieval, and temporal knowledge graphs to retain information across sessions — directly matching the need for an agent memory store with vector database orchestration and conversation persistence.
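The idea of storing interaction data as dated facts can be illustrated with a small sketch. This is not Zep's data model or API; the `DatedFact` and `TemporalFactStore` names, the field names, and the "latest fact on or before a date wins" rule are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class DatedFact:
    subject: str      # hypothetical: e.g. a user id
    attribute: str    # hypothetical: e.g. "preferred_language"
    value: str
    valid_from: date  # the date on which the fact was observed


class TemporalFactStore:
    """Toy in-memory store of dated facts (illustrative only)."""

    def __init__(self) -> None:
        self._facts: list[DatedFact] = []

    def add(self, fact: DatedFact) -> None:
        # Facts are only appended, so earlier observations stay queryable.
        self._facts.append(fact)

    def value_at(self, subject: str, attribute: str, when: date) -> Optional[str]:
        """Return the most recent value observed on or before `when`."""
        candidates = [
            f for f in self._facts
            if f.subject == subject
            and f.attribute == attribute
            and f.valid_from <= when
        ]
        if not candidates:
            return None
        return max(candidates, key=lambda f: f.valid_from).value
```

Because older facts are kept rather than overwritten, the same store can answer both "what does the user prefer now" and "what did the user prefer last year".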
PocketFlow is a graph-based framework for designing and executing large language model operations and reasoning patterns. It serves as an orchestrator for building goal-oriented autonomous agents, multi-agent systems, and retrieval-augmented generation pipelines. The system is distinguished by its ability to coordinate autonomous AI agents that use shared memory and tools to solve complex goals, supported by a structured output engine that enforces schema-consistent responses. It utilizes graph-based workflow orchestration to manage sequences of model operations and supports supervisor-based
PocketFlow is an agent orchestration framework built around shared memory, conversation history, and cross-node state sharing, with explicit support for RAG pipelines and tool-use memory — it directly delivers the persistent context store and memory integration this search asks for.
Koog is an LLM agent framework used to build autonomous entities that execute tool-based workflows. It utilizes a graph-based workflow engine to define agent behaviors and decision paths as a directed graph of nodes and edges. The framework distinguishes itself through a model provider orchestrator that enables dynamic switching, load balancing, and automatic fallbacks between different AI backends. It implements the Model Context Protocol to connect agents to remote tool servers and features a RAG memory system using vector embeddings to maintain long-term conversation context. The project
Koog is an LLM agent framework with a built-in RAG memory system using vector embeddings to persist conversation context across sessions, directly matching your need for an AI agent memory and persistent context store.
The BeeAI Framework is an LLM agent framework and multi-agent orchestration engine used to build autonomous agents that coordinate reasoning, tool execution, and complex workflows. It functions as a structured AI output controller and RAG integration library, providing a unified interface to manage multiple language model providers. The framework is distinguished by its implementation of the Model Context Protocol, allowing agents, tools, and models to be shared between different AI platforms and hosted as agentic tooling servers. It enables the design of collaborative agent teams through dec
The BeeAI Framework is an agent-building toolkit that includes explicit Agent Memory Extenders, state serialization, and RAG integration — all the core pieces needed to give AI agents persistent context across sessions.
Letta is a framework for building, deploying, and managing autonomous AI agents that maintain persistent state across long-term interactions. It provides a comprehensive suite of primitives for defining agents with configurable personas, modular memory blocks, and tool-use capabilities, enabling them to retain user preferences and conversation history over extended sessions. The platform distinguishes itself through its advanced memory management and orchestration capabilities. It allows agents to autonomously update their own memory, perform retrieval-augmented generation, and coordinate com
Letta is a dedicated framework for building autonomous AI agents with persistent memory, modular memory blocks, and retrieval-augmented generation, matching the need for a tool that retains context across sessions.
MemGPT is a memory management framework and external memory layer for large language models. It functions as a platform for building stateful AI agents that maintain a persistent identity and continuous context across multiple sessions. The system enables agents to bypass fixed context window limitations by using a virtual context windowing approach. This allows models to manage their own memory through internal commands to search, update, and delete stored information within a hierarchical structure of short-term working context and long-term archival storage. The framework provides a local
MemGPT is a dedicated memory management framework for AI agents that provides persistent context, hierarchical memory, and virtual context windows, directly fulfilling the need for a tool to retain agent information across sessions.
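The split between a short-term working context and long-term archival storage can be sketched roughly as follows. This is not MemGPT's implementation; the `TieredMemory` class, the item-count limit (a real system would more likely count tokens), and the oldest-first eviction policy are assumptions chosen to keep the example small.

```python
from collections import deque


class TieredMemory:
    """Toy two-tier memory: bounded working context plus an archive."""

    def __init__(self, working_limit: int) -> None:
        if working_limit < 1:
            raise ValueError("working_limit must be at least 1")
        self.working_limit = working_limit
        self.working = deque()  # short-term context, newest item last
        self.archive = []       # long-term storage, oldest item first

    def remember(self, item: str) -> None:
        # New items enter the working context; overflow moves to the archive.
        self.working.append(item)
        while len(self.working) > self.working_limit:
            self.archive.append(self.working.popleft())

    def search_archive(self, keyword: str) -> list[str]:
        # Case-insensitive substring match stands in for real retrieval.
        needle = keyword.lower()
        return [item for item in self.archive if needle in item.lower()]

    def forget(self, keyword: str) -> int:
        # Delete matching archived items and report how many were removed.
        needle = keyword.lower()
        before = len(self.archive)
        self.archive = [i for i in self.archive if needle not in i.lower()]
        return before - len(self.archive)
```

In the design the description outlines, the model itself would issue the search, update, and delete operations; here they are plain method calls.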
Claude-mem is an agentic memory persistence system designed to provide AI assistants with long-term context across multiple development sessions. It functions as a background orchestrator that captures, summarizes, and indexes interaction history, allowing models to maintain continuity and recall technical decisions from past tasks. By utilizing a vector-augmented context engine, the system injects relevant historical observations into active sessions, ensuring that AI agents remain informed without exceeding finite token budgets. The project distinguishes itself through an endless memory arc
Claude-mem is an agentic memory persistence system that integrates a vector database (ChromaDB) for RAG-based context retrieval, captures and summarizes conversation history, and manages context windows across sessions, which covers most of what this search asks of an AI agent memory store.
Embedchain is an LLM memory management framework and RAG orchestration engine designed to provide AI agents with a persistent storage layer. It functions as a long-term memory pipeline that extracts facts from unstructured interactions and stores them as permanent knowledge base entries to retain user preferences and interaction history across sessions. The system employs a hybrid vector database interface that combines semantic embeddings with traditional keyword search. It utilizes an entity-linking knowledge graph to connect related information points and applies temporal ranking to distin
Embedchain is an LLM memory management and RAG orchestration engine that provides a persistent, vector-backed memory pipeline for AI agents, directly addressing the need for retaining information across sessions with hybrid search, knowledge graphs, and temporal ranking.
Mastra is an orchestration framework designed for building, deploying, and managing autonomous AI agents and multi-agent systems. It provides a comprehensive suite of primitives for creating resilient AI applications, including durable workflow orchestration, event-driven agent loops, and semantic memory management. By integrating these core components, the platform enables developers to build complex, multi-step processes that can reason about goals and execute tasks without manual intervention. The framework distinguishes itself through its focus on observability and secure, isolated execut
Mastra is an agent orchestration framework with built-in semantic memory management, durable stateful context, and workflow orchestration. It provides persistent memory and context across sessions for AI agents, and its extensible design supports external backends and retrieval-augmented generation.
langchaingo is an LLM application framework for Go designed for building language model-powered applications and autonomous agents. It serves as an orchestration library and tool integration framework that allows developers to link prompt sequences and model calls into complex, multi-step workflows. The project provides a toolkit for implementing retrieval-augmented generation pipelines by processing unstructured documents and retrieving relevant context via vector search. It includes a dedicated integration layer for indexing high-dimensional embeddings and performing similarity searches acr
tmc/langchaingo is an LLM application framework in Go that includes vector database integrations, conversation state management, and RAG pipelines, making it a suitable tool for providing persistent context and memory for AI agents.
AIOS is an LLM agent operating system and orchestration kernel designed to manage memory, resource scheduling, and tool execution for multiple autonomous AI agents. It serves as a comprehensive framework for developing and deploying agents, featuring a dedicated resource manager that coordinates model backends, GPU memory, and isolated kernel instances. The system distinguishes itself through a semantic memory engine that uses vector search and autonomous clustering for long-term knowledge management, and a semantic file system that allows users to control computer files and system operations
AIOS is an LLM agent operating system that manages memory, resource scheduling, and tool execution, with a semantic memory engine that uses vector search for long-term knowledge, which fits the persistent-context use case of this search.
Agno is an agent operating system designed to manage the lifecycle, tool execution, and persistent state of autonomous agents across distributed infrastructure. It provides a unified runtime environment that wraps diverse agent frameworks into a consistent, interoperable protocol, allowing developers to build and deploy complex multi-agent systems that coordinate tasks and delegate sub-processes. The platform distinguishes itself through a robust governance and orchestration layer that includes human-in-the-loop approval gates, role-based access control, and a centralized API gateway. It feat
Agno is an agent operating system that manages persistent state and includes conversation history management, context window management, and memory retrieval features, directly supporting the AI agent memory and context store use case.
Eliza is a modular framework designed for building and deploying autonomous agents that operate across diverse digital environments. It functions as an orchestrator for intelligent software, enabling agents to manage tasks, maintain persistent memory, and execute automated processes through a centralized runtime. The framework distinguishes itself through a plugin-based architecture that facilitates cross-platform social automation and blockchain transaction capabilities. By utilizing state-machine logic for decision-making and vector-based memory for context retention, the system allows agen
Eliza is a modular agent framework explicitly built around persistent memory and vector-based context retention, with plugin-based architecture and RAG support that directly address the sought-after features for AI agent memory and cross-session context management.
MemMachine is a centralized memory management server and model-agnostic memory layer for large language models. It functions as a persistence layer that stores user profiles and conversational context, providing a decoupled data store that prevents vendor lock-in by serving different AI models through a consistent API. The system implements the Model Context Protocol to share persistent agent memories and session data with compatible AI clients. It utilizes a multi-tiered memory hierarchy, combining a graph-based conversation store for episodic interactions with a vector knowledge base for se
MemMachine is a centralized memory management server that provides a model-agnostic, persistent memory layer for AI agents, combining graph-based conversation stores with a vector knowledge base and supporting the Model Context Protocol for cross-session context sharing—directly matching the need for an open-source tool that retains agent information across sessions.
Mem0 is an agent-agnostic memory layer designed to provide intelligent agents with long-term persistence and cross-session state management. By acting as a centralized service, it allows diverse AI agents to recall user preferences, past interactions, and historical context, ensuring continuity across multiple workflows and independent agent systems. The platform distinguishes itself through a multi-signal retrieval engine that combines semantic vectors, keyword matching, and entity-linked metadata to surface the most relevant information. It employs an adaptive memory engine that automatical
Mem0 is a dedicated memory layer for AI agents that provides vector-based retrieval and persistent conversation and state management across sessions, covering the core needs of cross-session context retention and RAG support.
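A retrieval step that combines semantic vectors with keyword matching could look roughly like the sketch below. It is not Mem0's engine or API; the function names, the hand-written toy vectors, and the linear weighting by `alpha` are assumptions, and the entity-linked metadata signal is left out.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_overlap(query: str, text: str) -> float:
    """Fraction of query words that also appear in the text."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def hybrid_rank(query, query_vec, memories, alpha=0.5):
    """Rank memories best-first by a weighted mix of both signals.

    memories: list of (text, vector) pairs.
    alpha:    weight of the semantic score; 1 - alpha goes to keywords.
    """
    scored = [
        (alpha * cosine(query_vec, vec)
         + (1 - alpha) * keyword_overlap(query, text), text)
        for text, vec in memories
    ]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [text for _, text in ranked]
```

In practice the vectors would come from an embedding model; they are passed in directly here so the example stays self-contained.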
LangChain is an orchestration framework designed for building, managing, and deploying applications powered by large language models. It provides a unified integration layer that normalizes disparate model provider APIs into a consistent set of primitives, enabling developers to build complex, multi-step AI workflows that manage state, memory, and tool execution. The project distinguishes itself through a durable execution runtime that maintains persistent state across long-running processes by checkpointing progress to external storage. It models agent workflows as directed graphs, allowing
LangChain is an orchestration framework that natively manages persistent state, memory, and context for AI agents through checkpointing and external storage backends, making it a comprehensive fit for the requested AI agent memory/persistent context store use case.
LlamaIndex is a comprehensive development framework designed to connect private or external data sources to large language models. It functions as a data-centric toolkit that enables the construction of retrieval-augmented generation systems, allowing developers to build applications that provide context-aware answers based on specific organizational information. The project distinguishes itself through a robust agentic orchestration engine that supports the creation of autonomous agents capable of multi-step reasoning, memory management, and complex tool execution. Beyond simple retrieval, i
LlamaIndex is a framework for building LLM applications with built-in agent memory management, vector database integration, and RAG support, making it a comprehensive solution for persisting agent context across sessions.
Langchain-Chatchat is a system for building retrieval-augmented generation applications and autonomous AI agents. It integrates a knowledge base management system and an agent framework to enable language models to interact with private documents and execute multi-step tasks through external tools. The platform supports local deployment of language models on private infrastructure to operate without an internet connection. It includes a multimodal AI platform that combines vision models for image analysis with text-to-image generation capabilities. The system provides a web-based conversatio
Langchain-Chatchat is a full-featured platform for building autonomous AI agents with integrated knowledge base management, vector database support (FAISS, Milvus), and RAG capabilities, directly matching the need for persistent context and session-spanning memory.
LangGraph is a framework for building stateful, multi-step agentic workflows by modeling application logic as a directed graph. It provides a runtime environment where complex tasks are orchestrated through interconnected nodes and edges, allowing developers to manage state transitions, persistent memory, and control flow across long-running automated processes. The platform distinguishes itself through its native support for human-in-the-loop automation, enabling developers to define breakpoints that pause execution for manual review, modification, or approval. It also features checkpoint-ba
LangGraph is a framework for building stateful agentic workflows with built-in persistent memory via checkpoint-based state serialization, and it integrates deeply with LangChain for vector databases, RAG, and external backends — making it a comprehensive match for AI agent memory and context retention across sessions.
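Checkpoint-based state serialization can be sketched generically as below. This does not use LangGraph's API; the `run_with_checkpoints` function, the JSON file format, and the `next_step` field are assumptions, and the state is assumed to be a JSON-serializable dict.

```python
import json
import os


def run_with_checkpoints(steps, state, path):
    """Run `steps` in order, writing a checkpoint after each one.

    steps: list of functions that take a state dict and return a new one.
    path:  checkpoint file; if it exists, the run resumes after the last
           completed step and the `state` argument is ignored.
    """
    start = 0
    if os.path.exists(path):
        with open(path, encoding="utf-8") as f:
            saved = json.load(f)
        state, start = saved["state"], saved["next_step"]
    for index in range(start, len(steps)):
        state = steps[index](state)
        # Persist progress so a crash or pause does not repeat this step.
        with open(path, "w", encoding="utf-8") as f:
            json.dump({"state": state, "next_step": index + 1}, f)
    return state
```

The same checkpoint file is what would make a human-in-the-loop pause possible: execution can stop after any step and pick up later from the saved state.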
Agentscope is a comprehensive toolkit for developing and orchestrating autonomous multi-agent systems. It provides a unified framework for building agents that can reason, execute tools, and manage memory, enabling the creation of complex, collaborative workflows where multiple specialized agents interact to solve multi-step objectives. The platform distinguishes itself through a robust orchestration engine that supports both sequential and concurrent agent pipelines. It utilizes a centralized event bus for real-time telemetry, allowing developers to track agent reasoning, tool usage, and sys
Agentscope is a framework for building multi-agent systems that includes built-in memory management, state serialization, and tool-use tracking, directly addressing the need for persistent context across sessions.
This project is a Python framework for building autonomous, event-driven agent systems. It provides a unified runtime for orchestrating multi-agent workflows, managing persistent conversation state, and executing code within secure, isolated sandbox environments. The framework is designed to handle complex task delegation, allowing agents to invoke other agents as tools while maintaining context across multi-turn interactions. The framework distinguishes itself through its deep integration with the Model Context Protocol, enabling agents to connect to external data sources and remote services
OpenAI Agents Python is a framework for building autonomous agents with built-in persistent conversation state and context management across multi-turn interactions, fitting the search for AI agent memory, though it does not explicitly include vector database integration.
This project is an autonomous agent framework designed to integrate large language models with popular messaging platforms. It functions as a middleware platform that enables automated, multimodal interactions by decomposing complex user goals into sequential plans, executing them through external tools, and maintaining persistent context across sessions. The framework distinguishes itself through a modular skill architecture and a hybrid memory system. Users can extend system capabilities by installing custom logic modules from community hubs or generating them through natural language. The
ChatGPT-on-Wechat is an autonomous agent framework that includes a hybrid memory system for maintaining persistent context across sessions and supporting tool-use interaction memory, making it a suitable tool for agent memory needs even if some features like explicit vector database integration are not highlighted.
This is an open-source framework for building stateful, durable AI agents that run on Cloudflare Workers. It provides a runtime for long-lived agents that maintain a persistent identity, local SQL storage, and real-time connections, utilizing a lifecycle where agents hibernate when idle and wake on demand. The project distinguishes itself through its multi-channel orchestration, allowing a single agent to be deployed across voice, email, and chat interfaces with unified state. It implements the Model Context Protocol for standardized tool and data exchange and includes a dedicated framework f
cloudflare/agents is a framework for building stateful, durable AI agents with persistent identity and local SQL storage, making it the right kind of tool for maintaining memory across sessions—it covers conversation history persistence and agent state serialization, though it lacks explicit vector database and RAG integration.
Flowise is a low-code platform designed for building and deploying complex language model workflows through a visual, node-based interface. It functions as an orchestrator for autonomous multi-agent systems, allowing users to construct conversational pipelines by connecting language models, memory stores, and external tools on a drag-and-drop canvas. The platform distinguishes itself through its support for sophisticated agentic patterns, including supervisor-worker delegation and iterative reasoning strategies. Users can design directed acyclic graphs to manage conditional branching, state p
Flowise is a low-code platform for building LLM workflows that natively integrates memory stores, RAG, and agent orchestration, making it a solid match for creating AI agents with persistent context even though it focuses on visual pipeline building rather than being a dedicated memory library.
Memvid is an embedded memory framework designed to provide persistent, versioned context for intelligent agents. It functions as a local vector database library that stores all data within a single binary file, removing the need for external database infrastructure or network dependencies. The system distinguishes itself by integrating in-process vector indexing with append-only versioning, allowing for high-speed semantic similarity searches alongside the ability to track and roll back state changes over time. It includes built-in transparent data encryption and masking to secure sensitive i
Memvid is an embedded memory framework that provides persistent, versioned context for AI agents with built-in vector indexing and RAG support, making it a suitable tool for retaining information across sessions.
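Append-only versioning, where earlier states can still be read back, can be sketched as follows. This is not Memvid's file format or API; the `AppendOnlyLog` class and its integer version numbers are assumptions, and vector indexing and encryption are omitted.

```python
class AppendOnlyLog:
    """Toy append-only key-value log with versioned reads."""

    def __init__(self) -> None:
        self._entries = []  # (key, value) pairs; existing entries never change

    def put(self, key, value) -> int:
        """Append a write and return the version number it produced."""
        self._entries.append((key, value))
        return len(self._entries)

    def snapshot(self, version=None) -> dict:
        """Rebuild the state as of `version` (default: latest)."""
        latest = len(self._entries)
        upto = latest if version is None else version
        if not 0 <= upto <= latest:
            raise ValueError(f"version must be between 0 and {latest}")
        state = {}
        for key, value in self._entries[:upto]:
            state[key] = value  # later writes to a key override earlier ones
        return state
```

Reading an older snapshot is how a rollback would be served in this sketch: nothing is deleted, so every earlier version remains recoverable.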