30 open-source projects similar to llmware-ai/llmware, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best Llmware alternative.
The Gemini Cookbook is a comprehensive collection of implementation patterns, code samples, and development guides designed for building applications with Google Gemini models. It serves as a central resource for developers to integrate multimodal generative artificial intelligence into their software, providing the necessary frameworks to manage model interactions, stateful workflows, and structured data extraction. The repository distinguishes itself by offering specialized toolkits for autonomous agent orchestration, enabling the construction of agents that can execute code, browse the web
This project is a comprehensive framework for building AI-powered applications, providing a unified toolkit for orchestrating language models, autonomous agents, and interactive user interfaces. It serves as a central library for managing the entire lifecycle of AI interactions, from initial prompt generation and model provider abstraction to complex, multi-step reasoning and tool execution. The framework distinguishes itself through its deep integration with frontend development, specifically by enabling generative user interfaces that render dynamic components directly from model outputs. I
Mastra is an orchestration framework designed for building, deploying, and managing autonomous AI agents and multi-agent systems. It provides a comprehensive suite of primitives for creating resilient AI applications, including durable workflow orchestration, event-driven agent loops, and semantic memory management. By integrating these core components, the platform enables developers to build complex, multi-step processes that can reason about goals and execute tasks without manual intervention. The framework distinguishes itself through its focus on observability and secure, isolated execut
Kilocode is an autonomous engineering platform designed to orchestrate AI agents for complex software development tasks. It functions as a comprehensive system for automating coding, testing, and repository management by integrating directly with your codebase and terminal. The platform provides a unified gateway for model orchestration, allowing for the management of agentic workflows, event-driven automation, and persistent session state across distributed development environments. The platform distinguishes itself through its federated task management and policy-based access control, which
This project is a comprehensive framework for building and managing autonomous agent systems. It provides a unified architecture for orchestrating multi-agent societies, where specialized agents collaborate through roleplay to decompose and solve complex tasks. The system integrates language models with external environments, enabling agents to perform real-world actions through a standardized tool-calling abstraction layer. The framework distinguishes itself through its focus on iterative reasoning and data reliability. It employs automated feedback loops to refine agent outputs and self-eva
localGPT is a private AI knowledge base and retrieval-augmented generation application. It provides a local document indexer, a hybrid search engine, and an inference interface to enable chatting with private documents and managing a self-hosted information repository without sending data to external servers. The system distinguishes itself through a dual-pass verification pipeline that ensures generated answers are grounded in retrieved sources, accompanied by explicit source attribution. It employs a hybrid retrieval approach combining semantic vector search with keyword matching and rerank
Intel XPU LLM Acceleration Library is a toolkit designed to accelerate large language model inference and finetuning on Intel CPUs, GPUs, and NPUs. It provides a distributed inference engine for scaling models across multiple accelerators, a multimodal model runtime for vision and speech tasks, and a low-bit model quantization tool for converting weights into INT4, FP8, and GGUF formats. The project features a parameter-efficient finetuning framework that enables model adaptation using QLoRA and DPO on Intel hardware. It distinguishes itself by providing specialized optimizations for Intel XP
LangChain4j is a framework and library for building applications powered by large language models on the JVM. It provides a unified API for developing AI agents, implementing retrieval augmented generation, and integrating generative AI capabilities into professional software built with frameworks like Spring Boot or Quarkus. The project enables the creation of autonomous agents that can reason through tasks, manage memory, and execute external tools to achieve specific goals. It differentiates itself through a unified model interface that allows developers to switch between multiple model pr
This project is a retrieval-augmented generation pipeline designed for building custom ChatGPT plugins that allow language models to query private or professional documents. It implements a full retrieval workflow, from processing and indexing document chunks to retrieving relevant context for natural language queries. The system distinguishes itself through a hybrid retrieval approach that combines dense vector embeddings with sparse keyword matching, further refined by a two-stage semantic re-ranking process. It includes specialized data privacy tools for screening personally identifiable i
LanceDB is a vector database and columnar data store designed to function as a versioned dataset manager and vector search engine. It serves as a high-performance backend for indexing and retrieving high-dimensional embeddings, providing the foundation for machine learning data pipelines. The system distinguishes itself through a combination of cloud-native object storage and immutable version tracking, allowing for data time-travel and reproducible AI experiments. It integrates hybrid search capabilities, merging dense vector similarity with BM25 full-text search and SQL-like scalar filters
AutoRAG is an automation layer and optimization tool for retrieval-augmented generation. It provides a framework for measuring pipeline performance through an evaluation system and an automated search strategy that identifies the most effective combinations of retrieval and generation modules. The system distinguishes itself through AutoML-style optimization, using hyperparameter grid searches and automated trials to find the highest performing architectural configuration for a specific dataset. It includes a specialized dataset generator that creates synthetic question-answer pairs and groun
Moondream is a small-scale vision language model designed to reason across images to generate captions and answer natural language questions. It functions as an edge-optimized system capable of performing visual question answering, image captioning, and object detection. The project distinguishes itself through a lightweight architecture designed for local inference on embedded devices, workstations, and air-gapped hardware. It supports the execution of models on local GPUs and Apple Silicon to ensure data privacy and low latency. The system's capabilities include identifying precise object
qmd is a local semantic search engine and RAG knowledge base indexer that functions as a Model Context Protocol server. It converts local documents, markdown files, and codebases into a searchable database to provide retrieval augmented generation capabilities for AI agents. The system exposes its search and retrieval tools via stdio or HTTP. It utilizes local model files for embeddings and reranking, supporting query expansion across multiple languages. The project employs abstract syntax tree based chunking to split source code at function and class boundaries. It implements hybrid vector-
This project is a framework for developing multimodal AI agents that function as programmable participants in real-time communication rooms. It enables the construction of agents that can see, hear, and speak by integrating speech-to-text, large language models, and text-to-speech pipelines to facilitate low-latency, natural conversations. The system is distinguished by its advanced orchestration of real-time media and conversational flow, including support for full-duplex speech, preemptive response generation, and sophisticated interruption management. It further differentiates itself throu
Eino is an AI agent development kit and LLM application framework designed for building autonomous agents and orchestrating complex language model workflows. It serves as a multi-agent orchestration engine and workflow orchestrator, providing a graph-based execution model to route data between models, tools, and retrievers. The framework distinguishes itself through a robust set of multi-agent coordination patterns, including supervisor-led management, sequential flows, and autonomous reasoning loops like ReAct. It features advanced agent execution controls such as active turn preemption, che
PocketFlow is a graph-based framework for designing and executing large language model operations and reasoning patterns. It serves as an orchestrator for building goal-oriented autonomous agents, multi-agent systems, and retrieval-augmented generation pipelines. The system is distinguished by its ability to coordinate autonomous AI agents that use shared memory and tools to solve complex goals, supported by a structured output engine that enforces schema-consistent responses. It utilizes graph-based workflow orchestration to manage sequences of model operations and supports supervisor-based
Langroid is a multi-agent orchestration framework and tool integration suite designed for building complex AI applications. It serves as a multi-modal integration layer that connects diverse local and remote language models with an agentic retrieval-augmented generation system. The project distinguishes itself through a collaborative message-exchange paradigm, allowing specialized agents to delegate tasks hierarchically and coordinate via structured communication. It features an advanced state management system for conversational AI, including the ability to rewind and prune conversation hist
OpenVINO is an AI inference engine and model serving platform designed to execute optimized deep learning models across CPUs, GPUs, and NPUs through a unified API. It includes a model optimization toolkit for converting, quantizing, and compressing models from various frameworks, alongside a specialized generative AI runtime for large language models. The project distinguishes itself through a plugin-based hardware acceleration layer that maps neural network operations to vendor-specific drivers. It features advanced execution mechanisms such as continuous batching, speculative decoding, and
Kotaemon is an orchestration framework designed for building modular, agentic workflows that integrate document processing, retrieval-augmented generation, and multi-step reasoning. It provides a comprehensive platform for developing document-based question answering systems, allowing users to chain language models, prompt templates, and external tools into complex, automated pipelines. The system distinguishes itself through a highly modular architecture that emphasizes component-based composition and schema-driven data exchange. It supports autonomous agents capable of decomposing complex q
Unstructured is an enterprise-grade data orchestration engine designed to transform raw, unstructured files into structured, machine-readable formats. It functions as a comprehensive platform for document ingestion, partitioning, and enrichment, specifically engineered to prepare complex data for retrieval-augmented generation and agentic AI workflows. The platform distinguishes itself through its sophisticated document processing strategies, which combine rule-based extraction with vision-language models to handle diverse file layouts, tables, and images. It provides a modular architecture t
LiveKit is a comprehensive framework for building and orchestrating real-time, multimodal AI agents that interact with users through voice, video, and text. It provides a centralized, event-driven architecture to manage the entire lifecycle of automated participants, from initialization and session state management to graceful shutdown. By utilizing a selective forwarding unit, the platform efficiently routes media streams between participants and agents, ensuring low-latency communication and secure, token-based authentication for all connections. The platform distinguishes itself through it
PydanticAI is a Python framework designed for building production-grade autonomous agents. It provides a unified interface for interacting with diverse language models, enabling developers to construct agents that perform complex tasks through structured data validation, tool execution, and multi-turn conversation management. The library centers on type-safe schema enforcement, ensuring that model inputs and outputs remain consistent and reliable throughout the agent's lifecycle. The framework distinguishes itself through a robust architecture that emphasizes modularity and testability. It ut
This project is a Python framework for building autonomous, event-driven agent systems. It provides a unified runtime for orchestrating multi-agent workflows, managing persistent conversation state, and executing code within secure, isolated sandbox environments. The framework is designed to handle complex task delegation, allowing agents to invoke other agents as tools while maintaining context across multi-turn interactions. The framework distinguishes itself through its deep integration with the Model Context Protocol, enabling agents to connect to external data sources and remote services
ruby_llm is an LLM integration framework and AI agent orchestrator designed to connect applications to multiple large language model providers through a unified interface. It serves as a toolkit for building autonomous assistants with custom personas, managing structured output via JSON schemas, and implementing vector embedding engines for semantic search. The project distinguishes itself as an observability suite and multimodal toolkit. It provides specialized capabilities for tracking token usage, calculating model costs, and tracing workflows via OpenTelemetry, while supporting the proces
This project provides a dockerized AI workflow stack and orchestration templates for deploying a self-hosted AI environment. It establishes a localized infrastructure for building autonomous agents and model chains that process private data on-premises without external cloud dependencies. The environment is designed to support autonomous agent development, allowing models to dynamically select tools, execute shell commands, and interact with local file systems. It includes integrated vector database support to enable retrieval augmented generation and private document analysis. The stack cov
Rig is a framework for building large language model applications, featuring a multi-provider client and a workflow builder for retrieval-augmented generation systems. It serves as an orchestrator for creating autonomous agents that can maintain conversation state and execute complex tasks through custom prompting and plugins. The project provides standardized interfaces for both completion and embedding model providers, allowing for unified request and response patterns across different engines. It also includes a vector database integration layer that defines a common interface for indexing
This project is a container-native runtime designed for building, orchestrating, and executing autonomous AI agents. It provides a framework for managing multi-agent teams and complex workflows by packaging agent configurations as portable container images. By leveraging declarative configuration files, the system allows users to define agent personas, model routing, and tool access without requiring changes to application code. The platform distinguishes itself through its deep integration with container infrastructure, ensuring that agent tasks and external tools run within isolated environ
WeKnora is a multi-tenant retrieval-augmented generation (RAG) knowledge platform and autonomous AI agent framework. It transforms raw documents into queryable knowledge bases and integrates large language models with vector databases to provide grounded AI responses. The system also functions as a Model Context Protocol (MCP) tool server, exposing knowledge search and agentic capabilities to external AI clients. The platform distinguishes itself through an autonomous agent framework that utilizes iterative reasoning, tool calling, and web search to solve multi-step tasks. It implements a sta
This project is a comprehensive retrieval-augmented generation platform designed for building, managing, and deploying knowledge-based AI applications. It provides a unified environment for organizing datasets, configuring conversational chat assistants, and developing autonomous agents that execute multi-step reasoning workflows. By integrating document intelligence with advanced retrieval pipelines, the platform enables the creation of grounded, verifiable responses supported by traceable citations. The platform distinguishes itself through deep document understanding and sophisticated know
Paper-qa is a retrieval augmented generation system designed for question answering and analysis of scientific literature and technical documents. It functions as an LLM-powered research assistant that extracts grounded answers and summaries with citations from a document library. The system utilizes an agentic RAG orchestrator to iteratively refine search queries and gather evidence through multi-step tool calling. It features a multimodal document parser that extracts text, tables, and images from PDFs, alongside a vector-based indexer that embeds and caches document libraries for efficient