30 open-source projects similar to docker/genai-stack, ranked by how many features they share with it. Compare stars, activity, and what each project does to find the best GenAI Stack alternative.
This project is a comprehensive framework for building and managing autonomous agent systems. It provides a unified architecture for orchestrating multi-agent societies, where specialized agents collaborate through roleplay to decompose and solve complex tasks. The system integrates language models with external environments, enabling agents to perform real-world actions through a standardized tool-calling abstraction layer. The framework distinguishes itself through its focus on iterative reasoning and data reliability. It employs automated feedback loops to refine agent outputs and self-eva
FalkorDB is a high-performance graph database management system and vector graph database. It serves as a knowledge graph construction tool and a GraphRAG knowledge store, integrating structured property graphs with vector search to provide grounded context for large language models. The engine is designed as a multi-tenant graph engine, capable of hosting thousands of isolated datasets within a single instance. The system distinguishes itself by using linear algebra for query execution, expressing traversals over relationship tensors as matrix multiplications to achieve low-latency multi-hop queries. It ut
pgai is a PostgreSQL AI toolkit and framework designed to integrate large language models and vector embeddings directly into a database. It serves as a bridge for executing machine learning model requests and performing text-to-SQL translations within standard database queries. The project provides an automated vector embedding pipeline that handles the loading, parsing, and chunking of text from tables and unstructured documents. This system utilizes a background worker to synchronize embeddings automatically as source data changes and includes specialized tools for building retrieval-augme
QAnything is a retrieval-augmented generation application framework and self-hosted AI interface. It functions as a system that combines a vector database knowledge base, a document parsing service, and a hybrid search engine to generate answers based on private user data. The project features a modular pipeline architecture that allows users to independently replace components such as parsers, embedding models, and reranking engines. It supports local-first model deployment and offline operation to ensure data privacy, and includes a two-stage retrieval pipeline that merges dense vector embe
Chonkie is a text chunking library designed for retrieval-augmented generation pipelines. It functions as a semantic text splitter and RAG ingestion pipeline, transforming raw text into embedded segments for storage in vector databases. The project distinguishes itself through specialized splitting strategies, including an AST-based code splitter for preserving logical boundaries in source code and a semantic text splitter that uses embedding models to determine boundaries based on meaning. It also provides a vector database ingestor to automate the generation of embeddings and their export t
This project provides a dockerized AI workflow stack and orchestration templates for deploying a self-hosted AI environment. It establishes a local infrastructure for building autonomous agents and model chains that process private data on-premises without external cloud dependencies. The environment is designed to support autonomous agent development, allowing models to dynamically select tools, execute shell commands, and interact with local file systems. It includes integrated vector database support to enable retrieval-augmented generation and private document analysis. The stack cov
Eino is an AI agent development kit and LLM application framework designed for building autonomous agents and orchestrating complex language model workflows. It serves as a multi-agent orchestration engine and workflow orchestrator, providing a graph-based execution model to route data between models, tools, and retrievers. The framework distinguishes itself through a robust set of multi-agent coordination patterns, including supervisor-led management, sequential flows, and autonomous reasoning loops like ReAct. It features advanced agent execution controls such as active turn preemption, che
Langroid is a multi-agent orchestration framework and tool integration suite designed for building complex AI applications. It serves as a multi-modal integration layer that connects diverse local and remote language models with an agentic retrieval-augmented generation system. The project distinguishes itself through a collaborative message-exchange paradigm, allowing specialized agents to delegate tasks hierarchically and coordinate via structured communication. It features an advanced state management system for conversational AI, including the ability to rewind and prune conversation hist
This project is a comprehensive framework for building AI-powered applications, providing a unified toolkit for orchestrating language models, autonomous agents, and interactive user interfaces. It serves as a central library for managing the entire lifecycle of AI interactions, from initial prompt generation and model provider abstraction to complex, multi-step reasoning and tool execution. The framework distinguishes itself through its deep integration with frontend development, specifically by enabling generative user interfaces that render dynamic components directly from model outputs. I
Paper-qa is a retrieval-augmented generation system designed for question answering and analysis of scientific literature and technical documents. It functions as an LLM-powered research assistant that extracts grounded answers and summaries with citations from a document library. The system utilizes an agentic RAG orchestrator to iteratively refine search queries and gather evidence through multi-step tool calling. It features a multimodal document parser that extracts text, tables, and images from PDFs, alongside a vector-based indexer that embeds and caches document libraries for efficient
ZenML is an extensible machine learning orchestration framework designed to manage the end-to-end lifecycle of data pipelines and AI agent workflows. It functions as a durable orchestrator that executes machine learning tasks as directed acyclic graphs, ensuring that every step is containerized for consistent performance across local, cloud, and hybrid infrastructure. By decoupling pipeline code from underlying compute and storage backends, the platform allows developers to define infrastructure-agnostic stacks that remain portable across diverse environments. The project distinguishes itself
Superduper is an AI agent development kit and LLM application framework designed to build autonomous agents and data-driven applications. It functions as a RAG orchestration platform and vector search infrastructure, coordinating AI models with database storage to perform multi-step computations and actions using persisted data states. The project distinguishes itself by providing a database-integrated machine learning pipeline that executes training and inference tasks directly on data hosted within SQL and NoSQL databases. It allows for the deployment of self-hosted AI infrastructure on pri
Neo4j is a native graph database management system designed to store and query highly connected data using a property-graph model. It provides an ACID-compliant transaction engine that ensures data integrity, supported by a distributed cluster architecture that maintains causal consistency across nodes. Users interact with the system through a declarative query language, which allows for complex pattern matching and path traversal without requiring manual traversal logic. The platform distinguishes itself through its hybrid approach to data retrieval, combining traditional graph-based queries
LanceDB is a vector database and columnar data store designed to function as a versioned dataset manager and vector search engine. It serves as a high-performance backend for indexing and retrieving high-dimensional embeddings, providing the foundation for machine learning data pipelines. The system distinguishes itself through a combination of cloud-native object storage and immutable version tracking, allowing for data time-travel and reproducible AI experiments. It integrates hybrid search capabilities, merging dense vector similarity with BM25 full-text search and SQL-like scalar filters
Casibase is an open-source platform that orchestrates multi-turn conversations with large language models and manages retrieval-augmented knowledge bases from a single interface. It provides a unified system for connecting to over 30 AI model providers, ingesting documents into vector embeddings for semantic search, and running autonomous agent loops that can drive a browser, search the web, execute commands, and integrate with external tools. The platform distinguishes itself by combining AI conversation management with infrastructure and application orchestration capabilities. It includes a
This project is a retrieval-augmented generation framework designed to build pipelines that connect unstructured data and knowledge graphs with large language models. It functions as a vector database orchestrator for indexing text and multimodal content, as well as a system for translating natural language queries into structured database commands. The framework integrates a hybrid retrieval engine that combines dense vector search with sparse keyword matching to increase the precision of retrieved contexts. It further enhances reasoning and relationship mapping through a graph-augmented ret
AdalFlow is an autonomous AI agent framework and LLM application library designed for building modular workflows. It serves as a model-agnostic interface and RAG pipeline orchestrator, allowing users to develop ReAct agents that utilize iterative reasoning and external tool execution to solve complex tasks. The project distinguishes itself through a prompt optimization system that uses textual gradient descent to automatically refine prompt templates and few-shot examples. It treats model feedback as a differentiable signal, enabling a form of LLM backpropagation to iteratively improve output
Tiny Universe is an educational monorepo that delivers multiple independent implementations of core AI subsystems as self-contained Jupyter notebooks. It provides from-scratch constructions of foundational architectures including a complete Transformer model built from the original paper specification, a denoising diffusion probabilistic model for image generation, and a ReAct-style autonomous agent framework that equips an LLM with tools for planning and multi-step task execution. The project distinguishes itself by covering the full lifecycle of modern AI systems through hands-on implementa
PraisonAI is an autonomous AI agent platform that coordinates multiple LLM-powered agents for research, planning, and execution of complex workflows. It functions as a multi-agent orchestration framework, a workflow builder, and a Model Context Protocol server, while also providing retrieval-augmented generation through vector knowledge bases. Agents can interact via CLI, web, or standardized protocols with sandboxed code execution. The platform distinguishes itself with a rich set of agent communication protocols, including A2A, REST, WebSocket, voice and telephony integration, and MCP, allo
llm-graph-builder is a tool for transforming unstructured data into structured Neo4j graph databases using large language models. It functions as a graph orchestrator that automates the construction of nodes and relationships from raw text based on custom schemas. The project provides a visualizer for analyzing relational data as interactive networks and a token monitor to track daily and monthly API consumption per user. It also includes a vector embedding generator that utilizes configurable model providers to enable semantic search and retrieval-augmented generation. The system covers cap
Memgraph is an in-memory, distributed graph database designed for high-performance labeled property graph management. It utilizes a Cypher query engine for declarative data retrieval and manipulation, providing a scalable knowledge graph backend that integrates vector search and graph traversals. The system distinguishes itself as a real-time graph analytics platform, employing native C++ and CUDA implementations to execute complex network analysis and dynamic community detection on streaming data. It provides specialized support for AI integration, including GraphRAG capabilities, the constr
Kotaemon is an orchestration framework designed for building modular, agentic workflows that integrate document processing, retrieval-augmented generation, and multi-step reasoning. It provides a comprehensive platform for developing document-based question answering systems, allowing users to chain language models, prompt templates, and external tools into complex, automated pipelines. The system distinguishes itself through a highly modular architecture that emphasizes component-based composition and schema-driven data exchange. It supports autonomous agents capable of decomposing complex q
pdfGPT is a retrieval-augmented generation application and chatbot designed to analyze PDF documents. It functions as a document analyzer and vector search interface, using large language models to answer questions grounded in the content of uploaded files. The system implements a pipeline that extracts text from PDFs, splits content into overlapping segments, and uses vector-based semantic search to retrieve relevant context. This process allows the application to provide responses with verifiable source citations, including page number references to the original document. The project also
Letta is a framework for building, deploying, and managing autonomous AI agents that maintain persistent state across long-term interactions. It provides a comprehensive suite of primitives for defining agents with configurable personas, modular memory blocks, and tool-use capabilities, enabling them to retain user preferences and conversation history over extended sessions. The platform distinguishes itself through its advanced memory management and orchestration capabilities. It allows agents to autonomously update their own memory, perform retrieval-augmented generation, and coordinate com
This project is a private document analysis tool that enables conversational interaction with PDF files by executing all language model inference and processing entirely on the local machine. By running models directly within the browser or local environment, it ensures that sensitive user data remains offline and inaccessible to external servers or third-party cloud providers. The system utilizes retrieval-augmented generation to provide context-aware answers, supported by local document text extraction and vector embedding indexing. This architecture allows for semantic search and informati
Memori is an AI agent memory middleware platform designed to provide persistent, context-aware recall for language models. It functions as a non-intrusive layer that intercepts outbound model requests to automatically capture interaction history and execution traces, ensuring that agents maintain continuity across sessions without requiring modifications to existing application logic. The platform distinguishes itself through a dual-model storage architecture that maintains information as both structured relational primitives for precise fact retrieval and rolling narrative summaries for situ
This project is a web-based user interface and multi-model API gateway for interacting with various large language model providers and local inference services. It functions as a retrieval-augmented generation chatbot for private document questioning, a manager for model fine-tuning, and an autonomous agent framework. The system distinguishes itself by integrating an autonomous assistant mode that uses web search and external tools to solve complex, multi-step tasks without manual prompting. It also features an API gateway capable of rotating multiple authentication keys to balance usage and
PocketFlow is a graph-based framework for designing and executing large language model operations and reasoning patterns. It serves as an orchestrator for building goal-oriented autonomous agents, multi-agent systems, and retrieval-augmented generation pipelines. The system is distinguished by its ability to coordinate autonomous AI agents that use shared memory and tools to solve complex goals, supported by a structured output engine that enforces schema-consistent responses. It utilizes graph-based workflow orchestration to manage sequences of model operations and supports supervisor-based
llm-universe is a structured learning resource and technical guide focused on the development of large language model applications. It serves as a curriculum for mastering model orchestration, the creation of autonomous conversational agents, and the implementation of retrieval-augmented generation systems. The project provides detailed instructions on connecting model APIs with memory and tools to create execution chains. It specifically covers the construction of retrieval pipelines, including the process of cleaning raw documents, generating embeddings, and integrating vector databases to
ruby_llm is an LLM integration framework and AI agent orchestrator designed to connect applications to multiple large language model providers through a unified interface. It serves as a toolkit for building autonomous assistants with custom personas, managing structured output via JSON schemas, and implementing vector embedding engines for semantic search. The project distinguishes itself as an observability suite and multimodal toolkit. It provides specialized capabilities for tracking token usage, calculating model costs, and tracing workflows via OpenTelemetry, while supporting the proces