# RAG and chat-with-docs

> Search results for `RAG and chat-with-docs` on awesome-repositories.com. 116 total matches; showing the first 50.

Explore on the web: https://awesome-repositories.com/q/rag-and-chat-with-docs

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [this search on awesome-repositories.com](https://awesome-repositories.com/q/rag-and-chat-with-docs).**

## Results

- [flowiseai/flowise](https://awesome-repositories.com/repository/flowiseai-flowise.md) (53,641 ⭐) — Flowise is a low-code platform designed for building and deploying complex language model workflows through a visual, node-based interface. It functions as an orchestrator for autonomous multi-agent systems, allowing users to construct conversational pipelines by connecting language models, memory stores, and external tools on a drag-and-drop canvas.

The platform distinguishes itself through its support for sophisticated agentic patterns, including supervisor-worker delegation and iterative reasoning strategies. Users can design directed acyclic graphs to manage conditional branching, state persistence, and complex task distribution. It also provides a robust framework for retrieval-augmented generation, enabling the creation of self-correcting systems that can index document data and validate information autonomously.

Beyond its visual design capabilities, the project serves as a comprehensive backend for AI applications. It includes a secure credential management layer for third-party API keys, role-based access controls, and a RESTful API that allows for programmatic management of chat sessions, workflows, and assistant configurations.

The application is designed for flexible deployment, supporting containerized environments for consistent operation across local and cloud infrastructure. Detailed documentation and tutorials are available to guide users through the lifecycle of building, testing, and scaling production-ready AI agents.
- [aishwaryanr/awesome-generative-ai-guide](https://awesome-repositories.com/repository/aishwaryanr-awesome-generative-ai-guide.md) (24,755 ⭐) — This project is a community-driven knowledge repository and technical learning resource focused on the field of generative artificial intelligence. It serves as a centralized hub for developers and practitioners to access curated research, tutorials, and foundational concepts necessary for building and deploying modern artificial intelligence applications.

The platform distinguishes itself through a collaborative, distributed contribution model that aggregates diverse learning materials into a structured, searchable knowledge base. It covers a wide range of specialized topics, including retrieval-augmented generation, large language model training, fine-tuning techniques, and agentic workflows. Beyond technical skill development, the repository functions as a professional development hub, offering interview preparation resources and guidance for those pursuing careers in the artificial intelligence industry.

The content is organized through a hierarchical taxonomy, allowing users to navigate complex subjects such as system evaluation, multimodal models, and security tools. The repository provides access to comprehensive code notebooks and structured tutorials, all maintained as static documentation within a version control system to ensure accessibility and ease of discovery.
- [infiniflow/ragflow](https://awesome-repositories.com/repository/infiniflow-ragflow.md) (82,922 ⭐) — This project is a comprehensive retrieval-augmented generation platform designed for building, managing, and deploying knowledge-based AI applications. It provides a unified environment for organizing datasets, configuring conversational chat assistants, and developing autonomous agents that execute multi-step reasoning workflows. By integrating document intelligence with advanced retrieval pipelines, the platform enables the creation of grounded, verifiable responses supported by traceable citations.

The platform distinguishes itself through deep document understanding and sophisticated knowledge orchestration. It supports complex document parsing, including the extraction of tables and images, and utilizes graph-based indexing to enhance reasoning over large document collections. Users can configure multiple recall strategies and fused re-ranking to optimize retrieval accuracy, while the system maintains context through multi-turn dialogue management and flexible tool-use frameworks.

The architecture is built on a modular, containerized microservice foundation that supports both local inference engines and external language model APIs. It includes asynchronous task processing for document ingestion and indexing, ensuring system responsiveness during heavy workloads. The platform also provides a standardized interface for model abstraction, allowing for seamless integration with existing language model ecosystems.

Developers can interact with the platform through a comprehensive suite of RESTful endpoints and Python client libraries, which cover the full lifecycle of agents, datasets, and knowledge graphs. The system is designed for flexible deployment, offering configurable environment settings and support for custom containerized environments to facilitate local development and infrastructure portability.
- [jamwithai/production-agentic-rag-course](https://awesome-repositories.com/repository/jamwithai-production-agentic-rag-course.md) (6,972 ⭐) — This project is an educational course and technical blueprint for building production-ready retrieval-augmented generation systems. It provides a curriculum and implementation strategies for designing agentic workflows, containerized AI infrastructure, and retrieval pipelines using large language models.

The materials focus on agentic design patterns, utilizing state-based decision nodes to rewrite queries and grade retrieved documents. It differentiates its approach by providing a deployment framework for managing databases, search engines, and API services through container orchestration.

The project covers a broad range of architectural capabilities, including hybrid search with reciprocal rank fusion, OCR-based document parsing for PDF ingestion, and input-validation guardrails to prevent hallucinations. It also addresses operational requirements such as distributed request tracing, automatic query caching, and server-sent event streaming for real-time responses.
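
The reciprocal rank fusion step named in this entry can be illustrated with a minimal sketch. It is not taken from the course repository: the function name `reciprocal_rank_fusion`, the sample document IDs, and the constant `k=60` (a commonly used default) are all assumptions chosen for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    where rank starts at 1. k=60 is an assumed, commonly used default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a keyword search and a vector search.
keyword_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, which is why the technique suits hybrid search: it needs only ranks, not comparable scores, from each retriever.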
- [datawhalechina/llm-cookbook](https://awesome-repositories.com/repository/datawhalechina-llm-cookbook.md) (24,263 ⭐) — This repository is a comprehensive set of tutorials and examples for building software powered by large language models. It serves as an application development guide and a prompt engineering framework, providing instructional content for integrating model logic with user interfaces and external data sources.

The project provides technical walkthroughs for specialized workflows, including the implementation of retrieval augmented generation using vector databases and semantic search. It includes guidance on adapting pre-trained model weights through fine-tuning with private datasets and the orchestration of autonomous agents that connect language models to external tools and APIs.

The material covers a broad range of AI development capabilities, including prompt optimization for summarization and inference, the deployment of generative AI interfaces, and the systematic evaluation of model outputs for quality and consistency.
- [qnguyen3/chat-with-mlx](https://awesome-repositories.com/repository/qnguyen3-chat-with-mlx.md) (1,595 ⭐) — An all-in-one LLM chat UI for Apple Silicon Macs, built on the MLX framework.
- [milvus-io/milvus](https://awesome-repositories.com/repository/milvus-io-milvus.md) (44,804 ⭐) — Milvus is a specialized vector database engine designed for the indexing, management, and high-speed similarity retrieval of high-dimensional vector embeddings. It functions as a similarity search engine capable of identifying nearest neighbors within large-scale vector spaces, supporting the storage and retrieval of billions of data points while maintaining consistent performance.

The system utilizes a distributed architecture that decouples storage, query, and coordination into independent services, allowing for horizontal scaling across clusters. It employs a global indexing mechanism that builds specialized data structures across immutable, independently indexed segments. This design, combined with a shared-storage decoupled model, enables compute and storage resources to scale independently in cloud environments, while a log-based persistence layer ensures data durability and state recovery.

The platform supports a wide range of data retrieval patterns, including retrieval-augmented generation, hybrid search, and multimodal data retrieval for text, images, and graphs. Deployment options range from lightweight local instances for rapid prototyping to robust standalone setups and fully managed distributed clusters. Documentation includes sizing tools to assist in estimating hardware requirements based on specific data volumes and operational patterns.
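
The nearest-neighbor retrieval described in this entry can be pictured with a brute-force sketch in plain Python. This is not the Milvus API, and it does not reflect how Milvus indexes data; the helper names `cosine_similarity` and `top_k` and the toy two-dimensional vectors are assumptions for illustration only. A vector database replaces the linear scan below with specialized index structures.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, vectors, k=2):
    """Return the keys of the k stored vectors most similar to the query."""
    scored = [(cosine_similarity(query, vec), key) for key, vec in vectors.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [key for _, key in scored[:k]]


# Hypothetical toy embeddings; real embeddings have hundreds of dimensions.
vectors = {
    "cat": [1.0, 0.0],
    "kitten": [0.9, 0.1],
    "car": [0.0, 1.0],
}

nearest = top_k([1.0, 0.05], vectors, k=2)
print(nearest)
```

The linear scan costs one similarity computation per stored vector, which is the cost that approximate indexes are designed to avoid at the scale of billions of vectors.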
- [camel-ai/camel](https://awesome-repositories.com/repository/camel-ai-camel.md) (17,253 ⭐) — This project is a comprehensive framework for building and managing autonomous agent systems. It provides a unified architecture for orchestrating multi-agent societies, where specialized agents collaborate through roleplay to decompose and solve complex tasks. The system integrates language models with external environments, enabling agents to perform real-world actions through a standardized tool-calling abstraction layer.

The framework distinguishes itself through its focus on iterative reasoning and data reliability. It employs automated feedback loops to refine agent outputs and self-evaluate reasoning traces, ensuring high-quality results. To maintain operational integrity, the system enforces schema-based output parsing for reliable workflow integration and utilizes sandboxed environments for secure, isolated code execution.

Beyond its core orchestration capabilities, the project includes a suite of utilities for retrieval-augmented generation and synthetic data production. It supports persistent memory management via vector-based context retrieval and provides extensive tooling for web automation, API integration, and human-in-the-loop oversight. The platform is designed to be model-agnostic, offering a consistent interface for interacting with a wide range of proprietary and open-source language models.
- [sled-group/chat-with-nerf](https://awesome-repositories.com/repository/sled-group-chat-with-nerf.md) (0 ⭐) — Open-Vocabulary 3D Localization. Locate anything with natural language dialog! - Interactive Grounding. Humans will be able to chat with an agent to localize novel objects.
- [datawhalechina/prompt-engineering-for-developers](https://awesome-repositories.com/repository/datawhalechina-prompt-engineering-for-developers.md) (24,267 ⭐) — This project is a technical curriculum and development guide focused on large language model prompt engineering, fine-tuning, and the creation of retrieval augmented generation applications. It serves as a comprehensive resource for developers to master crafting precise instructions and textual patterns to improve the quality and predictability of model outputs.

The material covers the end-to-end workflow of adapting open-source models to specific datasets and integrating language models with vector databases to generate responses based on private information. It also provides a systematic approach to tracking and debugging generative AI systems through benchmarking and output evaluation.

Beyond prompt design, the guides address AI application orchestration by chaining model calls and logic steps into complex workflows. The scope includes implementing semantic search and managing the full lifecycle of AI application development from initial prompt construction to final model evaluation.

The project is implemented as a series of Jupyter Notebooks.
- [tony-xlh/chat-with-scanned-documents](https://awesome-repositories.com/repository/tony-xlh-chat-with-scanned-documents.md) (6 ⭐) — A demo of chatting with documents scanned using Dynamic Web TWAIN.
- [letta-ai/letta](https://awesome-repositories.com/repository/letta-ai-letta.md) (21,168 ⭐) — Letta is a framework for building, deploying, and managing autonomous AI agents that maintain persistent state across long-term interactions. It provides a comprehensive suite of primitives for defining agents with configurable personas, modular memory blocks, and tool-use capabilities, enabling them to retain user preferences and conversation history over extended sessions.

The platform distinguishes itself through its advanced memory management and orchestration capabilities. It allows agents to autonomously update their own memory, perform retrieval-augmented generation, and coordinate complex multi-agent workflows through hierarchical delegation. By supporting both local and remote execution environments, it enables developers to build stateful agents that can be managed programmatically via API or integrated into existing automation pipelines.

The system includes a robust set of administrative and security features, such as human-in-the-loop approval for tool execution, multi-tenant identity management, and automated performance evaluation suites. These tools allow for the creation of reproducible agent blueprints, version-controlled deployments, and detailed observability into agent reasoning and memory integrity.

The project is distributed as a Python-based framework, providing official SDKs and a command-line interface to facilitate integration into development workflows and production environments.
- [cogentapps/chat-with-gpt](https://awesome-repositories.com/repository/cogentapps-chat-with-gpt.md) (2,356 ⭐) — An open-source ChatGPT app with a voice
- [docling-project/docling](https://awesome-repositories.com/repository/docling-project-docling.md) (61,674 ⭐) — Docling is a modular framework designed for document parsing, layout analysis, and structured data extraction. It transforms unstructured files and web content into a unified, hierarchical data model that preserves the spatial and semantic relationships between text, tables, images, and layout elements. By normalizing diverse input formats into a consistent internal representation, the library enables uniform processing across various document types.

The project distinguishes itself through a schema-driven approach that maps document regions to strongly-typed objects, ensuring data accuracy through validation against predefined templates. Its pipeline-based architecture supports pluggable processing backends, allowing for the dynamic integration of specialized engines for optical character recognition and complex visual layout analysis. Users can control parsing behavior and extraction parameters through declarative configuration files, facilitating integration into automated workflows and server-based architectures.

The library provides both a programmatic interface and a command-line toolkit to support automated document processing and format conversion. It utilizes optional dependency management to allow for modular installation of specific features, such as media rendering or advanced processing capabilities, depending on the requirements of the application.
- [liaokongvfx/langchain-chinese-getting-started-guide](https://awesome-repositories.com/repository/liaokongvfx-langchain-chinese-getting-started-guide.md) (9,039 ⭐) — This project is a collection of tutorials and guides for building large language model applications using the LangChain framework, written in Chinese. It serves as a learning resource for developing software that integrates language models with memory and chain-based logic.

The resource provides specific walkthroughs for implementing retrieval augmented generation systems using vector stores and document loaders. It includes guides on creating autonomous agents that dynamically select and execute external tools, as well as tutorials for translating plain text queries into executable database commands.

The guides cover a broad range of capabilities, including the construction of custom knowledge bases, the implementation of conversational memory, and the execution of natural language data querying. It also addresses data processing tasks such as loading documents from diverse sources, splitting text for token limits, and extracting structured data from the web.
- [doc-detective/doc-detective](https://awesome-repositories.com/repository/doc-detective-doc-detective.md) (0 ⭐) — Doc Detective is a doc content testing framework that makes it easy to keep your docs accurate and up to date. You write tests, and Doc Detective runs them directly against your product to make sure your docs match your user experience. Whether it’s a UI-based process or a series of API calls, Doc…
- [kestra-io/kestra](https://awesome-repositories.com/repository/kestra-io-kestra.md) (27,073 ⭐) — Kestra is a declarative workflow orchestrator designed to manage complex task dependencies and automated processes through versioned configuration files. It functions as a distributed platform that decouples task scheduling from execution by offloading computational workloads to a fleet of worker nodes. The system uses a reactive, event-driven engine to initiate workflows automatically in response to external signals, webhooks, schedules, or file system changes.

The platform distinguishes itself through a modular plugin architecture that allows for the integration of custom tasks and external services. It provides an AI-native development environment that incorporates language models to generate, refine, and execute automation logic using natural language prompts. To support diverse operational needs, Kestra implements a multi-tenant execution model that isolates resources, data, and access controls for different teams within a single shared instance.

The system covers a broad range of operational capabilities, including robust state management, granular role-based access control, and comprehensive system auditing. It offers extensive tools for workflow logic, such as conditional branching, parallel task execution, and iterative processing, alongside built-in resilience features like automated retries and failure policies. Users can manage these configurations through a centralized interface that supports visual editing and real-time monitoring of execution status.
- [exyte/chat](https://awesome-repositories.com/repository/exyte-chat.md) (1,797 ⭐) — A SwiftUI Chat UI framework with fully customizable message cells and a built-in media picker
- [quivrhq/quivr](https://awesome-repositories.com/repository/quivrhq-quivr.md) (39,165 ⭐) — Quivr is a retrieval-augmented generation platform designed to transform raw documents into searchable knowledge bases. It functions as a centralized environment where users can ingest files, index them into vector databases, and interact with language models to receive contextually relevant, data-backed responses.

The platform distinguishes itself through an agentic workflow orchestrator that sequences retrieval tasks, tool execution, and model interactions to resolve complex, multi-step queries. This engine is entirely configuration-driven, allowing users to define document ingestion, chunking parameters, and workflow node sequences through structured schemas. By maintaining a unified knowledge management interface, the system tracks chat history alongside file storage, ensuring that interactions remain context-aware across diverse local and remote backends.

Beyond its core orchestration, the system provides a comprehensive pipeline for document processing, including parsing for various file formats and asynchronous task execution to maintain responsiveness during data ingestion. It supports the development of specialized chatbots, including voice-enabled interfaces, by integrating speech-to-text and text-to-speech capabilities with its underlying retrieval systems.

The project utilizes strict base classes to enforce configuration integrity, ensuring consistent data processing across all application settings.
- [zhayujie/chatgpt-on-wechat](https://awesome-repositories.com/repository/zhayujie-chatgpt-on-wechat.md) (45,353 ⭐) — This project is an autonomous agent framework designed to integrate large language models with popular messaging platforms. It functions as a middleware platform that enables automated, multimodal interactions by decomposing complex user goals into sequential plans, executing them through external tools, and maintaining persistent context across sessions.

The framework distinguishes itself through a modular skill architecture and a hybrid memory system. Users can extend system capabilities by installing custom logic modules from community hubs or generating them through natural language. The memory system combines vector-based similarity search with traditional keyword indexing to retrieve relevant historical context, while a dedicated web console allows for the management of these memory files, system logs, and active messaging channels.

The system supports a broad range of operational capabilities, including model-agnostic task routing, automated knowledge organization, and real-time reasoning visualization. It provides comprehensive administrative control through both terminal-based commands and slash-prefixed chat inputs, allowing for the management of runtime configurations, skill installations, and background processes.

The project is configured via centralized files and provides secure storage for API keys and environment secrets. It is designed for deployment as a persistent service, with support for cross-platform messaging and automated task scheduling.
- [mistralai/mistral-inference](https://awesome-repositories.com/repository/mistralai-mistral-inference.md) (10,819 ⭐) — Mistral Inference is a library for running Mistral large language models on a GPU, generating text from prompts with token streaming. It loads pretrained model weights from local disk or a remote registry into GPU memory, then produces output tokens one by one for real-time display in interactive applications.

The library supports multimodal prompts that accept image URLs alongside text, enabling visual description and reasoning. It includes content safety guardrails that scan generated text against predefined policies to block or flag policy violations. For structured interactions, it provides function-call prompt formatting so the model outputs a tool call instead of free text, and it offers code completion that fills in a missing middle segment given a prefix and suffix.

Beyond basic text generation, Mistral Inference provides an interactive chat interface for conversational loops, and it can be packaged into a Docker container for serving via a vLLM-compatible API endpoint. The library handles model loading from disk or registry, GPU-accelerated tensor computation, and streaming output through a generator interface.
- [lzyy/chat](https://awesome-repositories.com/repository/lzyy-chat.md) (326 ⭐) — A live chat built with Python (Flask + gevent + APScheduler) and Redis.
- [agentscope-ai/agentscope](https://awesome-repositories.com/repository/agentscope-ai-agentscope.md) (26,895 ⭐) — Agentscope is a comprehensive toolkit for developing and orchestrating autonomous multi-agent systems. It provides a unified framework for building agents that can reason, execute tools, and manage memory, enabling the creation of complex, collaborative workflows where multiple specialized agents interact to solve multi-step objectives.

The platform distinguishes itself through a robust orchestration engine that supports both sequential and concurrent agent pipelines. It utilizes a centralized event bus for real-time telemetry, allowing developers to track agent reasoning, tool usage, and system performance. By employing a provider-agnostic interface, the framework abstracts diverse language model APIs, while its middleware-based execution hooks allow for the injection of custom logic to intercept, validate, or transform agent behavior at runtime.

Beyond core orchestration, the project includes extensive capabilities for tool integration, including dynamic schema parsing from function docstrings and support for secure, sandboxed code execution. It also features built-in support for retrieval-augmented generation, long-term memory management, and systematic performance evaluation, providing a complete environment for the lifecycle management of agentic applications.

The library is designed for extensibility, offering base classes for custom memory backends, prompt formats, and tool providers. It is distributed as a Python package, with documentation and interactive development tools available to assist in prototyping and managing multi-agent projects.
- [tabbyml/tabby](https://awesome-repositories.com/repository/tabbyml-tabby.md) (33,605 ⭐) — Tabby is a self-hosted AI coding assistant designed to provide real-time code completion and interactive chat capabilities within development environments. By functioning as a private server application, it allows teams to maintain control over their infrastructure and data while integrating intelligent code generation directly into their existing workflows.

The platform distinguishes itself through its repository-aware knowledge retrieval and multi-model orchestration. It indexes local and remote source code repositories and technical documentation into a searchable vector-based knowledge graph, enabling the assistant to provide context-specific answers and code suggestions. The system manages distinct pipelines for completion, chat, and embedding models, allowing users to tune performance and hardware utilization based on specific task requirements.

The architecture supports scalable, containerized deployment, enabling consistent performance across local and cloud environments. It utilizes declarative configuration to manage infrastructure and service replicas, while integrating with development environments through standard messaging interfaces. Users can configure specific models for different tasks, ensuring compatibility with performance benchmarks and hardware constraints.
- [raudaschl/rag-fusion](https://awesome-repositories.com/repository/raudaschl-rag-fusion.md) (940 ⭐) — RAG-Fusion: multi-query generation + Reciprocal Rank Fusion for better retrieval-augmented generation. Includes evaluation harness with NFCorpus/BEIR.
- [mrrezaeiuoft/amg-rag](https://awesome-repositories.com/repository/mrrezaeiuoft-amg-rag.md) (0 ⭐) — AMG-RAG (Agentic Medical Graph-RAG) is a comprehensive framework that automates the construction and continuous updating of Medical Knowledge Graphs (MKGs), integrates reasoning, and retrieves current external evidence for medical Question Answering (QA). Our approach addresses the challenge of…
- [xiaolincoder/cs-base](https://awesome-repositories.com/repository/xiaolincoder-cs-base.md) (18,024 ⭐) — CS-Base is a comprehensive educational platform and technical repository designed to support software engineers in mastering backend architecture, artificial intelligence engineering, and career development. It functions as a centralized knowledge hub that combines illustrated theoretical tutorials with practical, project-based learning to bridge the gap between foundational computer science concepts and professional industry requirements.

The project distinguishes itself by integrating a robust career mentorship framework with advanced AI engineering resources. It provides users with tools for resume optimization, interview simulation, and personalized study planning, while simultaneously offering deep-dive technical curriculum on topics such as retrieval-augmented generation, autonomous agent orchestration, and distributed system design. By synthesizing these domains, the platform enables developers to build production-grade applications while preparing for high-stakes technical hiring processes.

Beyond its educational focus, the repository serves as a technical reference for implementing complex software patterns. It covers a broad capability surface including concurrency management, memory optimization, and secure system architecture, providing structured guidance on how to apply these principles within modern development workflows.

The project is documented through a collection of technical guides, curated question banks, and project templates available directly within the repository.
- [microsoft/generative-ai-for-beginners](https://awesome-repositories.com/repository/microsoft-generative-ai-for-beginners.md) (112,045 ⭐) — This project is a comprehensive, open-source educational curriculum designed to guide developers through the mastery of generative artificial intelligence. It provides a structured learning path that covers foundational concepts, prompt engineering, and the practical application of large language models. The repository serves as a central hub for skill acquisition, offering sequential modules that progress from basic model mechanics to advanced architectural patterns.

The curriculum distinguishes itself by focusing on the end-to-end lifecycle of intelligent software, including the implementation of retrieval-augmented generation and agentic workflow orchestration. It provides technical guidance on integrating diverse models—ranging from open-source options to cloud-based services—while emphasizing responsible development through systematic safety guardrails and ethical design practices. Learners are equipped to build functional applications, such as conversational interfaces, semantic search tools, and automated content generators, using standardized interfaces and modern development techniques.

Beyond core model implementation, the resource covers operational practices for monitoring and maintaining AI systems in production. It includes practical modules on fine-tuning, vector-based indexing, and designing intuitive user experiences for intelligent systems. The repository is structured to support developers through every stage of the process, from initial environment configuration and dependency management to deployment readiness and troubleshooting.
- [apache/apisix](https://awesome-repositories.com/repository/apache-apisix.md) (16,767 ⭐) — This project is a high-performance, distributed API gateway designed to manage, secure, and observe traffic for microservices, serverless functions, and artificial intelligence model providers. It functions as a dynamic service proxy and cloud-native ingress controller, centralizing policy enforcement and traffic routing through a unified configuration interface that synchronizes state across multiple nodes in real time.

The platform distinguishes itself through a highly extensible architecture that utilizes a high-performance scripting engine to execute modular logic directly within the request lifecycle. It provides specialized capabilities for modern AI workflows, including model request proxying, token-based budget enforcement, content moderation, and agentic workflow tracing. Furthermore, it supports complex multi-protocol environments by bridging diverse communication standards, including gRPC and various binary protocols, without requiring additional sidecar processes.

Beyond its core proxying functions, the gateway offers a comprehensive suite of traffic management and security tools. It handles authentication and authorization through multiple strategies, including token validation and identity provider integration, while maintaining granular control over TLS policies and secret management. The system also provides robust observability through distributed tracing, metrics exporting, and detailed request logging, ensuring visibility into both standard API traffic and complex AI-driven interactions.

The software is designed for containerized environments and can be deployed using standard container images, with full support for translating Kubernetes ingress resources into live routing rules.
- [advanced-chat/vue-advanced-chat](https://awesome-repositories.com/repository/advanced-chat-vue-advanced-chat.md) (2,062 ⭐) — A chat rooms web component compatible with all JavaScript frameworks.
- [itzcrazykns/perplexica](https://awesome-repositories.com/repository/itzcrazykns-perplexica.md) (35,308 ⭐) — Perplexica is an AI-powered search engine that synthesizes real-time web results into coherent, cited summaries. By utilizing large language models and retrieval augmentation, the platform gathers information from the live internet to provide accurate answers to complex user queries, ensuring that all generated content includes verifiable source citations.

The project functions as a search orchestration platform that aggregates data from multiple sources and exposes these capabilities through standard endpoints. This allows for automated data integration, enabling external software to retrieve AI-generated insights and summaries for custom workflows. The system is designed for portability, utilizing container-based orchestration to ensure consistent execution across diverse hosting environments.

Beyond its core search functionality, the platform supports automated infrastructure provisioning and network configuration to facilitate deployment. It includes features for exposing local service instances to external networks, which supports collaborative testing and remote access to private application instances.
- [pageman/sutskever-30-implementations](https://awesome-repositories.com/repository/pageman-sutskever-30-implementations.md) (3,148 ⭐) — This project is a collection of deep learning research implementations and a reproduction kit designed to translate theoretical AI papers into working code. It provides a library of neural network architectures and reference implementations for reproducing seminal research concepts through interactive notebooks.

The repository distinguishes itself through the implementation of AI theory and scaling laws, covering complexity dynamics, information theory, and the simulation of universal AI agents. It also includes a benchmarking suite for synthetic reasoning, allowing for the evaluation of model performance and the analysis of scaling laws across compute and parameter counts.

The architectural coverage spans a wide range of models, including memory-augmented networks, Transformers, graph neural networks, and convolutional vision pipelines. It implements specialized systems such as retrieval-augmented generation and sequence-to-sequence models, supported by utilities for model parallelism, network compression, and training optimization.

The project provides a practical reference for implementing these advanced architectures using a tensor-based framework.
- [intellabs/rag-fit](https://awesome-repositories.com/repository/intellabs-rag-fit.md) (770 ⭐) — Framework for enhancing LLMs for RAG tasks using fine-tuning.
- [hkuds/rag-anything](https://awesome-repositories.com/repository/hkuds-rag-anything.md) (21,372 ⭐) — RAG-Anything is a retrieval-augmented generation framework designed to index diverse document formats and perform semantic search using local machine learning models. It functions as a local multimodal data processor, extracting and organizing information from various file types into a unified knowledge base to facilitate private document analysis.

The system distinguishes itself through its high-throughput ingestion engine, which processes large batches of documents into searchable vector embeddings. By executing machine learning models directly on local hardware, the framework ensures that sensitive data remains private and independent of external cloud services.

The platform supports comprehensive data management, including the ability to parse multimodal information and assemble context-aware windows for precise retrieval. It provides a structured pipeline for indexing high volumes of data and performing semantic similarity searches to generate accurate, context-specific responses.
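
The index-then-search flow described above can be sketched in a few lines. This is only an illustration under stated assumptions, not RAG-Anything's API: `embed` is a hypothetical bag-of-words stand-in for a real embedding model, and `build_index`, `retrieve`, and `build_prompt` are names invented for this example.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_index(chunks):
    """Indexing step: store each text chunk next to its vector."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def retrieve(index, query, k=2):
    """Search step: return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(index, query, k=2):
    """Generation step input: retrieved chunks become the model's context."""
    context = "\n".join(retrieve(index, query, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

A real deployment would likely swap `embed` for a learned embedding model and the list-based index for a vector store; the ranking-by-similarity step stays conceptually the same.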
- [open-webui/open-webui](https://awesome-repositories.com/repository/open-webui-open-webui.md) (142,694 ⭐) — Open WebUI is a self-hosted, web-based platform designed for interacting with local and remote artificial intelligence models. It functions as a unified interface and orchestration suite, enabling users to build, deploy, and manage specialized AI agents equipped with custom instructions, external tool access, and private knowledge bases.

The platform distinguishes itself through a modular architecture that supports complex AI workflows. It features a plugin-based framework for custom logic and pipeline-based request processing, allowing developers to filter or transform data streams before they reach a model. For enterprise environments, it provides centralized model management, role-based access control, and integration with standard identity providers like LDAP and SSO. It also includes sandboxed code execution and vector-database-based retrieval, enabling models to perform secure computations and semantic searches across private document collections.

Beyond its core chat capabilities, the platform offers extensive administrative and operational tools. It supports multi-node deployments, horizontal scaling, and comprehensive system observability to ensure reliability in production settings. Users can further customize the interface, manage API access via personal tokens, and utilize persistent workspaces for collaborative knowledge management.

The software is packaged for container-orchestrated deployment, allowing for consistent execution across diverse cloud and local infrastructure.
- [basedhardware/omi](https://awesome-repositories.com/repository/basedhardware-omi.md) (12,869 ⭐) — Omi is an open-source wearable AI platform that captures audio and screen data to provide real-time conversational assistance and memory. It integrates a wearable hardware development kit with a vector memory database and large language model capabilities to create a persistent digital record of user interactions.

The platform is distinguished by its BLE audio streaming pipeline, which transmits raw audio from wearable hardware for real-time transcription and speaker identification. It utilizes a plugin-based agent tool framework that allows AI assistants to autonomously invoke custom functions and interact with external services.

The system covers broad capability areas including semantic memory retrieval, voice-driven workflow automation, and multimodal activity capture. It manages the full lifecycle of AI interactions through automated conversation summarization, persona emulation, and the programmatic management of memories and action items.

The project provides a choice between self-hosting the backend or using a managed cloud service, with available SDKs for building third-party applications.
- [40ants/cl-project-with-docs](https://awesome-repositories.com/repository/40ants-cl-project-with-docs.md) (5 ⭐) — Common Lisp project skeleton generator that uses Sphinx and reStructuredText to render readable HTML documentation.
- [cmavro/gnn-rag](https://awesome-repositories.com/repository/cmavro-gnn-rag.md) (0 ⭐) — Code for GNN-RAG: Graph Neural Retrieval for Large Language Model Reasoning.
- [onyx-dot-app/onyx](https://awesome-repositories.com/repository/onyx-dot-app-onyx.md) (17,491 ⭐) — Onyx is an enterprise-grade AI platform designed for knowledge management, search, and autonomous agent orchestration. It functions as a centralized system that aggregates unstructured organizational data, enabling secure, context-aware retrieval and interaction across internal documents and communication history. By integrating retrieval-augmented generation with multi-model orchestration, the platform provides a unified interface for teams to query internal knowledge bases and execute complex, multi-step business processes.

The platform distinguishes itself through a focus on private infrastructure and strict security, allowing organizations to deploy services on-premise or in isolated containers to meet data residency requirements. It features a modular data connector framework that indexes information from disparate third-party applications, ensuring that all search and chat interactions adhere to existing role-based access controls. Furthermore, the system supports agentic workflows that decompose complex research requests into parallel sub-queries, synthesizing evidence-based responses from both internal data and live web research.

Beyond its core search and retrieval capabilities, the platform includes tools for managing the full lifecycle of AI integration. This includes administrative oversight for team collaboration, benchmarking and cost analysis for various language models, and the ability to configure specialized agents with unique instructions and tool access. Users can interact with these capabilities through a web interface, integrated messaging platforms, or a dedicated desktop application.
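
The idea of decomposing a research request into parallel sub-queries, mentioned in the Onyx description above, can be illustrated with a short sketch. It does not reflect Onyx's actual implementation: `decompose`, `search`, and `answer` are hypothetical names, the split on " and " stands in for what would likely be an LLM-driven planning step, and `search` is a stub rather than a real retrieval call.

```python
from concurrent.futures import ThreadPoolExecutor


def decompose(request: str) -> list[str]:
    """Hypothetical decomposition: split the request on ' and '."""
    return [part.strip() for part in request.split(" and ") if part.strip()]


def search(sub_query: str) -> str:
    """Hypothetical retrieval stub standing in for a document or web search."""
    return f"evidence for '{sub_query}'"


def answer(request: str, max_workers: int = 4) -> str:
    """Run every sub-query concurrently, then merge results in original order."""
    sub_queries = decompose(request)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with sub_queries.
        results = list(pool.map(search, sub_queries))
    return "\n".join(f"- {q}: {r}" for q, r in zip(sub_queries, results))
```

Threads suit this shape of work because each sub-query is typically I/O-bound (a network or database call), so the calls can overlap while waiting.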
- [livekit/livekit](https://awesome-repositories.com/repository/livekit-livekit.md) (19,358 ⭐) — LiveKit is a comprehensive framework for building and orchestrating real-time, multimodal AI agents that interact with users through voice, video, and text. It provides a centralized, event-driven architecture to manage the entire lifecycle of automated participants, from initialization and session state management to graceful shutdown. By utilizing a selective forwarding unit, the platform efficiently routes media streams between participants and agents, ensuring low-latency communication and secure, token-based authentication for all connections.

The platform distinguishes itself through its modular pipeline-based media processing, which chains specialized speech-to-text, language, and text-to-speech services into cohesive workflows. It includes advanced capabilities for real-time voice activity detection, enabling natural turn-taking and interruption handling, alongside remote procedure call tooling that allows agents to execute external functions or access local resources during a conversation. Developers can further extend these interactions by integrating photorealistic virtual avatars that synchronize visual expressions with the agent's audio output.

Beyond core conversational logic, the system offers extensive support for telephony integration, allowing agents to connect to public networks via SIP for inbound and outbound calling. It provides a robust suite of observability and monitoring tools to track agent performance, connection quality, and session events, ensuring reliability in production environments. The platform also includes specialized utilities for task automation, such as capturing and validating structured user data, and supports multi-step workflow orchestration to handle complex, context-aware interactions.

The project provides a command-line interface for scaffolding, deploying, and testing agent applications, with documentation available in machine-readable formats to assist in development.
- [mervinpraison/praisonai](https://awesome-repositories.com/repository/mervinpraison-praisonai.md) (5,592 ⭐) — PraisonAI is an autonomous AI agent platform that coordinates multiple LLM-powered agents for research, planning, and execution of complex workflows. It functions as a multi-agent orchestration framework, a workflow builder, and a Model Context Protocol server, while also providing retrieval-augmented generation through vector knowledge bases. Agents can interact via CLI, web, or standardized protocols with sandboxed code execution.

The platform distinguishes itself with a rich set of agent communication protocols, including A2A, REST, WebSocket, voice and telephony integration, and MCP, allowing agents to be exposed as services and connect to external systems. Comprehensive safety governance enforces human-in-the-loop approval for destructive actions, sandboxed code execution, policy-based tool permissions, and output validation. Memory and state management are advanced, with persistent memory across sessions, checkpoints, per-user isolation, and support for multiple backends including SQLite, PostgreSQL, Redis, MongoDB, Weaviate, and vector stores. Multi-agent orchestration includes planning, delegation, sequential and parallel execution, conditional branching, and compensation patterns for handling partial failures.

Broader capabilities cover agent monitoring with cost tracking, telemetry, and live visualization, as well as testing and evaluation tools for debugging, replay, and batch assessment. Extensibility is provided through custom tools, MCP server connections, and a recipe management system for reusable workflows. Content processing includes image analysis and generation, OCR, speech synthesis and transcription, video analysis, and data analysis. Deployment options span REST APIs, messaging platforms, Docker and Kubernetes, and background job execution. Search and knowledge retrieval incorporate hybrid search, query rewriting, deep research, and web research with citations.

Agents and workflows are defined in YAML and orchestrated through a command-line interface that also supports interactive coding, real-time chat, and voice interactions.
- [igorantun/node-chat](https://awesome-repositories.com/repository/igorantun-node-chat.md) (768 ⭐) — Chat application built with Node.js and Material Design
- [mlflow/mlflow](https://awesome-repositories.com/repository/mlflow-mlflow.md) (26,554 ⭐)
- [iovisor/bpf-docs](https://awesome-repositories.com/repository/iovisor-bpf-docs.md) (1,012 ⭐) — Presentations and docs
- [danny-avila/librechat](https://awesome-repositories.com/repository/danny-avila-librechat.md) (39,276 ⭐) — LibreChat is an artificial intelligence orchestration platform that provides a unified interface for interacting with multiple language models. It functions as a centralized workspace where users can switch between different intelligence engines, manage complex conversational workflows, and maintain persistent memory across sessions through a vector-database-backed storage system.

The platform distinguishes itself through an extensible agent framework that supports autonomous task execution and the integration of external tools. It features a secure, containerized environment for executing code snippets and dynamically renders interactive artifacts, such as visual diagrams and functional user interface components, directly within the chat window. These capabilities allow for hands-on manipulation of generated content and the processing of multi-step tasks.

Beyond core conversational features, the platform includes tools for dynamic knowledge retrieval, enabling the assistant to fetch and rerank live web data to provide up-to-date information. It also incorporates enterprise-grade security measures, including server-side session management and support for standard authentication protocols like OAuth and SAML, to ensure controlled access in multi-user environments.
- [gitterhq/docs](https://awesome-repositories.com/repository/gitterhq-docs.md) (58 ⭐) — Moved to https://gitlab.com/gitlab-org/gitter/docs
- [mastra-ai/mastra](https://awesome-repositories.com/repository/mastra-ai-mastra.md) (21,221 ⭐) — Mastra is an orchestration framework designed for building, deploying, and managing autonomous AI agents and multi-agent systems. It provides a comprehensive suite of primitives for creating resilient AI applications, including durable workflow orchestration, event-driven agent loops, and semantic memory management. By integrating these core components, the platform enables developers to build complex, multi-step processes that can reason about goals and execute tasks without manual intervention.

The framework distinguishes itself through its focus on observability and secure, isolated execution. It features a built-in telemetry pipeline that captures structured execution traces, logs, and performance metrics, allowing for real-time debugging and evaluation of agent behavior. Furthermore, it utilizes sandboxed environments to isolate code execution and filesystem operations, ensuring that agent interactions remain secure and reproducible.

Mastra covers a broad capability surface, including multi-agent delegation hierarchies, schema-validated tool execution, and real-time voice interaction. It supports advanced orchestration patterns such as human-in-the-loop approvals, persistent state management for long-running workflows, and retrieval-augmented generation using vector-based semantic memory. These features are designed to work together to support the entire lifecycle of AI-powered applications, from initial development and testing to production deployment.

The project is built for TypeScript environments and provides a modular architecture that integrates with existing web stacks and infrastructure. It includes a client SDK for interacting with remote agents and supports various authentication providers to secure API endpoints and agent resources.
- [ufund-me/qbot](https://awesome-repositories.com/repository/ufund-me-qbot.md) (17,659 ⭐) — Qbot is a multi-purpose platform designed to support automated recruitment, quantitative trading, and distributed service orchestration. It functions as a comprehensive framework that integrates artificial intelligence into specialized workflows, enabling users to build and deploy systems for candidate screening, financial strategy execution, and context-aware knowledge retrieval.

The platform distinguishes itself through a modular architecture that combines high-performance distributed communication with domain-specific automation. It provides a robust foundation for managing microservices through service discovery, load balancing, and annotation-driven dependency injection, while simultaneously offering specialized engines for parsing resumes, conducting simulated voice interviews, and executing automated investment strategies.

Beyond its core engines, the system includes extensive capabilities for data management and infrastructure orchestration. It supports retrieval-augmented generation by processing documents into vector stores for semantic search, manages complex financial data pipelines, and ensures system reliability through persistent connection monitoring and containerized deployment. The platform is designed for extensibility, allowing for centralized configuration of multiple artificial intelligence model providers and logical versioning of distributed services.
- [chat-sdk/chat-sdk-ios](https://awesome-repositories.com/repository/chat-sdk-chat-sdk-ios.md) (920 ⭐) — Chat SDK iOS - Open Source Mobile Messenger
- [modular/modular](https://awesome-repositories.com/repository/modular-modular.md) (26,357 ⭐) — Modular is a unified machine learning development platform designed for building, compiling, and deploying high-performance neural network models. It provides a comprehensive execution engine that supports both local and production-grade inference, enabling developers to manage the entire model lifecycle from initial architecture definition to scalable, containerized service deployment.

The platform distinguishes itself through a hardware-agnostic runtime that abstracts diverse silicon architectures, allowing models to execute efficiently across varied compute environments. It includes a specialized stack for systems-level kernel programming, which provides direct memory control and low-level access to hardware primitives. This allows for the development of custom neural network operators and high-performance compute kernels, which are then integrated into optimized execution graphs through automated compilation and operator fusion.

Beyond core execution, the platform offers extensive tooling for performance engineering, including granular profiling instrumentation, hardware-specific bottleneck analysis, and automated benchmarking against defined datasets. It supports a wide range of generative AI tasks through a standardized, multi-modal interface that handles text, image, and video generation. The system also manages infrastructure requirements, including environment orchestration, dependency synchronization, and automated workload routing for high-throughput production clusters.
