MemOS is an open-source persistent memory layer for AI agents and large language models, providing a self-hosted server that stores and retrieves structured memory across sessions. It enables AI systems to recall user preferences, history, and context without retraining, using a graph-based API and a web management interface for viewing, editing, and organizing memory items, skills, traces, and knowledge bases.
The system distinguishes itself through a portable memory interchange protocol that allows memory to be transferred between different AI models, devices, and applications, and through a three-tier memory architecture that organizes information into Skills, Traces/Episodes, and World Models. Persistent memory is kept as human-readable Markdown on local SQLite storage rather than in opaque vector databases, and retrieval is hybrid: it combines semantic search (vector cosine similarity) with lexical search (BM25). Additional differentiators include predictive memory preloading, which loads relevant context before it is needed; asynchronous memory ingestion, which targets millisecond-level latency under high concurrency; and memory-based skill crystallization, which extracts repeated strategies into callable, versioned skills.
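To make the hybrid retrieval idea concrete, the following is a minimal sketch of how a cosine-similarity score and a BM25 score might be fused into one ranking. It is not MemOS code: the function names (`cosine`, `bm25_scores`, `hybrid_rank`), the min-max normalization, the weighted-sum fusion, and the `alpha` weight are all assumptions chosen for illustration, and the source does not say how MemOS actually combines the two signals.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Okapi BM25 score of each tokenized document for the query."""
    n = len(docs_tokens)
    avgdl = sum(len(d) for d in docs_tokens) / n if n else 0.0
    # Document frequency: number of documents containing each term.
    df = Counter(t for d in docs_tokens for t in set(d))
    scores = []
    for d in docs_tokens:
        tf = Counter(d)
        s = 0.0
        for t in query_tokens:
            if t not in tf:
                continue
            idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
            denom = tf[t] + k1 * (1 - b + b * len(d) / avgdl)
            s += idf * tf[t] * (k1 + 1) / denom
        scores.append(s)
    return scores


def min_max(xs):
    """Rescale scores to [0, 1] so the two signals are comparable."""
    lo, hi = min(xs), max(xs)
    return [0.0 if hi == lo else (x - lo) / (hi - lo) for x in xs]


def hybrid_rank(query_vec, query_text, memories, alpha=0.5):
    """Rank (text, vector) memories by a weighted sum of both signals.

    alpha is a hypothetical weight: 1.0 is purely semantic, 0.0 purely lexical.
    """
    if not memories:
        return []
    vec = [cosine(query_vec, v) for _, v in memories]
    lex = bm25_scores(
        query_text.lower().split(),
        [t.lower().split() for t, _ in memories],
    )
    fused = [
        alpha * v + (1 - alpha) * l
        for v, l in zip(min_max(vec), min_max(lex))
    ]
    order = sorted(range(len(memories)), key=lambda i: fused[i], reverse=True)
    return [(memories[i][0], fused[i]) for i in order]


# Toy data: the 2-dimensional vectors stand in for real embeddings.
memories = [
    ("user prefers dark mode", [1.0, 0.0]),
    ("deploy server on sqlite", [0.0, 1.0]),
    ("user likes tea", [0.7, 0.7]),
]
ranked = hybrid_rank([1.0, 0.0], "dark mode", memories)
```

In this toy run the first memory wins on both signals, while the third is semantically close to the query but shares no query terms, so it ranks between the other two. Other fusion schemes, such as reciprocal rank fusion, would be equally plausible here.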
The platform offers composable knowledge base cubes for managing multiple isolated knowledge bases with controlled sharing across users, projects, and agents, along with per-agent memory isolation and feedback-based memory refinement. It supports multi-modal memory storage for text, images, tool traces, and personas, and provides a CLI tool for cross-agent memory sharing. Memory lifecycle management covers full create, read, update, and delete (CRUD) operations, batch cleanup, tagging, and governance. Retrieval budget control limits token consumption by capping the number of memories recalled per task. The system also enables team skill sharing over LAN or VPN and supports data import and export in JSON, legacy plugin, and agent-specific formats.
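The retrieval budget described above can be illustrated with a short sketch. It assumes memories arrive already ranked as `(text, score)` pairs and caps how many are recalled. The `apply_budget` function, its parameters, the optional token ceiling, and the whitespace-based token estimate are hypothetical and do not reflect the MemOS API.

```python
def apply_budget(ranked, max_memories=5, max_tokens=None):
    """Select top-ranked memories without exceeding the recall budget.

    ranked: list of (text, score) pairs, best first.
    max_memories: hard cap on the number of memories recalled per task.
    max_tokens: optional ceiling on the estimated total token count.
    Returns (selected_texts, estimated_tokens_used).
    """
    selected = []
    used = 0
    for text, _score in ranked:
        if len(selected) >= max_memories:
            break
        # Crude token estimate (assumption): one token per whitespace word.
        cost = len(text.split())
        if max_tokens is not None and used + cost > max_tokens:
            continue  # skip memories that would overflow the token budget
        selected.append(text)
        used += cost
    return selected, used


ranked_memories = [
    ("a b c", 0.9),
    ("d e f g h", 0.8),
    ("i j", 0.7),
    ("k", 0.6),
]
by_count = apply_budget(ranked_memories, max_memories=2)
by_tokens = apply_budget(ranked_memories, max_memories=3, max_tokens=6)
```

With a count cap of 2, only the two best memories are kept. With the additional token ceiling of 6, the five-word memory is skipped and lower-ranked but shorter memories fill the remaining budget. A real system would likely use the model's own tokenizer rather than a word count.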
Documentation covers deployment across public cloud, private cloud, on-premises, and hybrid architectures. The server runs on SQLite, which supports privacy and offline operation.