# Graph RAG Systems

> Search results for `knowledge graph RAG over a corpus of documents` on awesome-repositories.com. 116 total matches; showing the first 50.

Explore on the web: https://awesome-repositories.com/q/knowledge-graph-rag-over-a-corpus-of-documents

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [this search on awesome-repositories.com](https://awesome-repositories.com/q/knowledge-graph-rag-over-a-corpus-of-documents).**

## Results

- [camel-ai/camel](https://awesome-repositories.com/repository/camel-ai-camel.md) (17,253 ⭐) — This project is a comprehensive framework for building and managing autonomous agent systems. It provides a unified architecture for orchestrating multi-agent societies, where specialized agents collaborate through roleplay to decompose and solve complex tasks. The system integrates language models with external environments, enabling agents to perform real-world actions through a standardized tool-calling abstraction layer.

The framework distinguishes itself through its focus on iterative reasoning and data reliability. It employs automated feedback loops to refine agent outputs and self-eva
- [mastra-ai/mastra](https://awesome-repositories.com/repository/mastra-ai-mastra.md) (21,221 ⭐) — Mastra is an orchestration framework designed for building, deploying, and managing autonomous AI agents and multi-agent systems. It provides a comprehensive suite of primitives for creating resilient AI applications, including durable workflow orchestration, event-driven agent loops, and semantic memory management. By integrating these core components, the platform enables developers to build complex, multi-step processes that can reason about goals and execute tasks without manual intervention.

The framework distinguishes itself through its focus on observability and secure, isolated execut
- [anthropics/claude-cookbooks](https://awesome-repositories.com/repository/anthropics-claude-cookbooks.md) (45,835 ⭐) — This repository serves as a comprehensive library of architectural blueprints and code examples for integrating large language models into software applications. It functions as a developer learning resource, providing structured tutorials and implementation patterns that demonstrate how to build intelligent features using advanced prompting and data processing techniques.

The collection distinguishes itself by focusing on complex reasoning and data-grounding workflows. It provides practical guidance on implementing retrieval-augmented generation pipelines, which connect language models to pr
- [aishwaryanr/awesome-generative-ai-guide](https://awesome-repositories.com/repository/aishwaryanr-awesome-generative-ai-guide.md) (24,755 ⭐) — This project is a community-driven knowledge repository and technical learning resource focused on the field of generative artificial intelligence. It serves as a centralized hub for developers and practitioners to access curated research, tutorials, and foundational concepts necessary for building and deploying modern artificial intelligence applications.

The platform distinguishes itself through a collaborative, distributed contribution model that aggregates diverse learning materials into a structured, searchable knowledge base. It covers a wide range of specialized topics, including retri
- [othmanadi/planning-with-files](https://awesome-repositories.com/repository/othmanadi-planning-with-files.md) (14,139 ⭐) — Planning with files is an enterprise knowledge graph platform designed to transform unstructured organizational data into a searchable, interconnected network. By utilizing a graph-based retrieval-augmented generation engine, the system grounds language model outputs in verified internal data, ensuring that responses are explainable, traceable, and free from hallucinations.

The platform distinguishes itself through a focus on data sovereignty and secure, private infrastructure deployment. It enables organizations to maintain full control over sensitive information by processing data locally o
- [infiniflow/ragflow](https://awesome-repositories.com/repository/infiniflow-ragflow.md) (82,922 ⭐) — This project is a comprehensive retrieval-augmented generation platform designed for building, managing, and deploying knowledge-based AI applications. It provides a unified environment for organizing datasets, configuring conversational chat assistants, and developing autonomous agents that execute multi-step reasoning workflows. By integrating document intelligence with advanced retrieval pipelines, the platform enables the creation of grounded, verifiable responses supported by traceable citations.

The platform distinguishes itself through deep document understanding and sophisticated know
- [rahulnyk/knowledge_graph](https://awesome-repositories.com/repository/rahulnyk-knowledge-graph.md) (2,978 ⭐) — This project is a tool for transforming unstructured text into semantic knowledge graphs. It uses local language models to extract entities and their relationships, converting text corpora into a structured network of linked concepts.

The system provides a web interface for interactive network visualization, allowing users to navigate the resulting nodes and edges. It includes a topology analysis tool that calculates node degrees and identifies community clusters to determine the visual size and color of graph elements.

Beyond visualization, the project enables graph-based information retrie
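- [lihanghang/knowledge-graph](https://awesome-repositories.com/repository/lihanghang-knowledge-graph.md) (1,759 ⭐) — Research and applications of natural language processing, knowledge graphs, dialogue systems, large language models, and related technologies.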
- [hbai-ltd/toonflow-app](https://awesome-repositories.com/repository/hbai-ltd-toonflow-app.md) (10,084 ⭐) — Toonflow-app is an agent orchestration platform designed to automate complex creative workflows and content production pipelines. It coordinates tiered hierarchies of agents to decompose tasks and transform scripts into storyboards and short-form comic videos.

The platform features a non-linear infinite canvas workflow editor for arranging scripts and assets, supporting parallel production and backtracking. It utilizes a dynamic prompt manager that externalizes agent behaviors into markdown files for real-time tuning and a vector-based memory store to maintain consistent session context throu
- [gocn/knowledge](https://awesome-repositories.com/repository/gocn-knowledge.md) (2,667 ⭐) — Knowledge graph of the Go community.
- [datahub-project/datahub](https://awesome-repositories.com/repository/datahub-project-datahub.md) (12,141 ⭐) — DataHub is a metadata management platform designed to unify technical, operational, and business context across diverse data ecosystems. By utilizing a graph-based metadata model and an event-driven ingestion architecture, it creates a centralized source of truth that maps complex data relationships, lineage, and ownership. This foundational framework enables organizations to maintain a synchronized view of their data landscape, supporting both human-led discovery and automated data operations.

The platform distinguishes itself through its focus on grounding artificial intelligence and autono
- [memgraph/memgraph](https://awesome-repositories.com/repository/memgraph-memgraph.md) (4,163 ⭐) — Memgraph is an in-memory, distributed graph database designed for high-performance labeled property graph management. It utilizes a Cypher query engine for declarative data retrieval and manipulation, providing a scalable knowledge graph backend that integrates vector search and graph traversals.

The system distinguishes itself as a real-time graph analytics platform, employing native C++ and CUDA implementations to execute complex network analysis and dynamic community detection on streaming data. It provides specialized support for AI integration, including GraphRAG capabilities, the constr
- [grafana/grafana](https://awesome-repositories.com/repository/grafana-grafana.md) (74,456 ⭐) — Grafana is an observability data platform designed to aggregate metrics, logs, and traces from diverse sources into a unified environment. It functions as a centralized interface for visualizing complex telemetry data, transforming raw streams into interactive dashboards that support real-time system health tracking and performance monitoring.

The platform distinguishes itself through a plugin-based modular architecture that integrates disparate databases, cloud services, and monitoring tools via a standardized data abstraction layer. This framework allows for the dynamic loading of external
- [microsoft/graphrag](https://awesome-repositories.com/repository/microsoft-graphrag.md) (33,792 ⭐) — GraphRAG is a data processing pipeline and retrieval engine designed to transform unstructured text into interconnected knowledge graphs. By utilizing language models to extract entities and relationships, it builds structured representations of information that enable context-aware retrieval for downstream applications.

The system distinguishes itself through hierarchical graph clustering and large-scale data synthesis, which organize massive document corpora into multi-level structures. This approach allows for both vector-based semantic searches and graph-based traversals, providing a comp
- [mrrezaeiuoft/amg-rag](https://awesome-repositories.com/repository/mrrezaeiuoft-amg-rag.md) (34 ⭐) — AMG-RAG (Agentic Medical Graph-RAG) is a comprehensive framework that automates the construction and continuous updating of Medical Knowledge Graphs (MKGs), integrates reasoning, and retrieves current external evidence for medical Question Answering (QA). Our approach addresses the challenge of…
- [openspg/kag](https://awesome-repositories.com/repository/openspg-kag.md) (8,548 ⭐) — KAG is a graph-augmented retrieval-augmented generation system and knowledge graph engine. It functions as a framework that integrates large language models with graph retrieval and numerical calculation to resolve natural language queries.

The system creates unified knowledge representations by aligning unstructured data and expert rules through semantic mapping. It maintains mutual indexing between graph structures and original text blocks to ensure that reasoning processes remain linked to verifiable source data.

The project provides capabilities for semantic information integration, grap
- [spcl/graph-of-thoughts](https://awesome-repositories.com/repository/spcl-graph-of-thoughts.md) (2,805 ⭐) — This is the official implementation of Graph of Thoughts: Solving Elaborate Problems with Large Language Models. This framework gives you the ability to solve complex problems by modeling them as a Graph of Operations (GoO), which is automatically executed with a Large Language Model (LLM) as…
- [memorilabs/memori](https://awesome-repositories.com/repository/memorilabs-memori.md) (15,358 ⭐) — Memori is an AI agent memory middleware platform designed to provide persistent, context-aware recall for language models. It functions as a non-intrusive layer that intercepts outbound model requests to automatically capture interaction history and execution traces, ensuring that agents maintain continuity across sessions without requiring modifications to existing application logic.

The platform distinguishes itself through a dual-model storage architecture that maintains information as both structured relational primitives for precise fact retrieval and rolling narrative summaries for situ
- [docling-project/docling](https://awesome-repositories.com/repository/docling-project-docling.md) (61,674 ⭐) — Docling is a modular framework designed for document parsing, layout analysis, and structured data extraction. It transforms unstructured files and web content into a unified, hierarchical data model that preserves the spatial and semantic relationships between text, tables, images, and layout elements. By normalizing diverse input formats into a consistent internal representation, the library enables uniform processing across various document types.

The project distinguishes itself through a schema-driven approach that maps document regions to strongly-typed objects, ensuring data accuracy t
- [knowledgecanvas/knowledge](https://awesome-repositories.com/repository/knowledgecanvas-knowledge.md) (1,458 ⭐) — Knowledge is a tool for saving, searching, accessing, exploring and chatting with all of your favorite websites, documents and files.
- [xerrors/yuxi-know](https://awesome-repositories.com/repository/xerrors-yuxi-know.md) (4,354 ⭐) — Yuxi-Know is an LLM agent orchestration platform that coordinates multiple AI agents through graph-based workflows to decompose and execute complex reasoning tasks. It functions as a multi-tenant AI workspace with an agentic chat interface, combining retrieval-augmented generation with knowledge graph management for enterprise document processing and retrieval.

The platform distinguishes itself through graph-based agent orchestration, where directed acyclic graphs define execution dependencies between reasoning steps, enabling parallel or sequential task decomposition. It provides multi-tenan
- [crewaiinc/crewai](https://awesome-repositories.com/repository/crewaiinc-crewai.md) (53,687 ⭐) — CrewAI is a multi-agent orchestration framework designed for building autonomous systems that execute complex, multi-step workflows. It provides a development platform where specialized agents are defined with specific roles, goals, and tool sets to perform tasks collaboratively. By leveraging a declarative workflow engine, the system manages task dependencies, state transitions, and execution logic, allowing for the creation of structured, stateful sequences of operations.

The framework distinguishes itself through its hierarchical management capabilities, which utilize manager agents to coo
- [jm199504/financial-knowledge-graphs](https://awesome-repositories.com/repository/jm199504-financial-knowledge-graphs.md) (3,103 ⭐) — A workflow for building a small financial knowledge graph (neo4j / python / cypher / KG).
- [forem/forem](https://awesome-repositories.com/repository/forem-forem.md) (22,726 ⭐) — Forem is an open-source platform designed for building and managing technical communities. It functions as a social publishing engine that enables members to share long-form content, participate in threaded discussions, and engage through social interactions. The platform provides tools for organizations to maintain branded profiles, host community hackathons, and facilitate collaborative learning through structured educational tracks.

Beyond its social features, Forem integrates advanced capabilities for AI agent workflow orchestration and codebase knowledge graphing. It allows developers to
- [lemonhu/stock-knowledge-graph](https://awesome-repositories.com/repository/lemonhu-stock-knowledge-graph.md) (2,158 ⭐) — Builds a small securities knowledge graph / knowledge base from publicly available web data.
- [neo4j/neo4j](https://awesome-repositories.com/repository/neo4j-neo4j.md) (15,928 ⭐) — Neo4j is a native graph database management system designed to store and query highly connected data using a property-graph model. It provides an ACID-compliant transaction engine that ensures data integrity, supported by a distributed cluster architecture that maintains causal consistency across nodes. Users interact with the system through a declarative query language, which allows for complex pattern matching and path traversal without requiring manual traversal logic.

The platform distinguishes itself through its hybrid approach to data retrieval, combining traditional graph-based queries
- [labring/fastgpt](https://awesome-repositories.com/repository/labring-fastgpt.md) (27,132 ⭐) — FastGPT is a comprehensive platform for building, deploying, and managing context-aware artificial intelligence applications. It provides a unified environment that integrates custom data sources with language models, utilizing a retrieval-augmented generation engine to ground responses in accurate, domain-specific information. The system is designed for enterprise-scale use, featuring multi-tenant architecture, administrative controls, and secure authentication protocols including OAuth 2.0 and custom single sign-on integration.

The platform distinguishes itself through a visual, node-based
- [datawhalechina/all-in-rag](https://awesome-repositories.com/repository/datawhalechina-all-in-rag.md) (3,989 ⭐) — This project is a retrieval-augmented generation framework designed to build pipelines that connect unstructured data and knowledge graphs with large language models. It functions as a vector database orchestrator for indexing text and multimodal content, as well as a system for translating natural language queries into structured database commands.

The framework integrates a hybrid retrieval engine that combines dense vector search with sparse keyword matching to increase the precision of retrieved contexts. It further enhances reasoning and relationship mapping through a graph-augmented ret
- [geeks-of-data/knowledge-gpt](https://awesome-repositories.com/repository/geeks-of-data-knowledge-gpt.md) (291 ⭐) — Extract knowledge from any information source using GPT and other language models. Index the sources and run Q&A sessions over them.
- [howl-anderson/tools_for_corpus_of_people_daily](https://awesome-repositories.com/repository/howl-anderson-tools-for-corpus-of-people-daily.md) (292 ⭐) — Tools for processing the People's Daily corpus.
- [dair-ai/prompt-engineering-guide](https://awesome-repositories.com/repository/dair-ai-prompt-engineering-guide.md) (75,678 ⭐) — This project is a comprehensive educational resource and technical guide focused on the development, optimization, and application of large language models. It provides a structured curriculum for mastering prompt engineering, ranging from foundational principles of instruction design to advanced techniques for improving model reasoning, accuracy, and reliability.

The guide distinguishes itself by offering deep technical insights into agentic workflows and autonomous system design. It covers the implementation of multi-step reasoning chains, tool integration through function calling, and stat
- [milvus-io/milvus](https://awesome-repositories.com/repository/milvus-io-milvus.md) (44,804 ⭐) — Milvus is a specialized vector database engine designed for the indexing, management, and high-speed similarity retrieval of high-dimensional vector embeddings. It functions as a similarity search engine capable of identifying nearest neighbors within large-scale vector spaces, supporting the storage and retrieval of billions of data points while maintaining consistent performance.

The system utilizes a distributed architecture that decouples storage, query, and coordination into independent services, allowing for horizontal scaling across clusters. It employs a global indexing mechanism that
- [datawhalechina/tiny-universe](https://awesome-repositories.com/repository/datawhalechina-tiny-universe.md) (4,505 ⭐) — Tiny Universe is an educational monorepo that delivers multiple independent implementations of core AI subsystems as self-contained Jupyter notebooks. It provides from-scratch constructions of foundational architectures including a complete Transformer model built from the original paper specification, a denoising diffusion probabilistic model for image generation, and a ReAct-style autonomous agent framework that equips an LLM with tools for planning and multi-step task execution.

The project distinguishes itself by covering the full lifecycle of modern AI systems through hands-on implementa
- [vitali87/code-graph-rag](https://awesome-repositories.com/repository/vitali87-code-graph-rag.md) (1,909 ⭐)
- [flowiseai/flowise](https://awesome-repositories.com/repository/flowiseai-flowise.md) (53,641 ⭐) — Flowise is a low-code platform designed for building and deploying complex language model workflows through a visual, node-based interface. It functions as an orchestrator for autonomous multi-agent systems, allowing users to construct conversational pipelines by connecting language models, memory stores, and external tools on a drag-and-drop canvas.

The platform distinguishes itself through its support for sophisticated agentic patterns, including supervisor-worker delegation and iterative reasoning strategies. Users can design directed acyclic graphs to manage conditional branching, state p
- [falkordb/falkordb](https://awesome-repositories.com/repository/falkordb-falkordb.md) (3,437 ⭐) — FalkorDB is a high-performance graph database management system and vector graph database. It serves as a knowledge graph construction tool and a GraphRAG knowledge store, integrating structured property graphs with vector search to provide grounded context for large language models. The engine is designed as a multi-tenant graph engine, capable of hosting thousands of isolated datasets within a single instance.

The system distinguishes itself by using linear algebra for query execution, treating relationship tensors as matrix multiplications to achieve low-latency multi-hop traversals. It ut
- [strongcourage/fuzzing-corpus](https://awesome-repositories.com/repository/strongcourage-fuzzing-corpus.md) (320 ⭐) — My fuzzing corpus
- [hkuds/lightrag](https://awesome-repositories.com/repository/hkuds-lightrag.md) (36,651 ⭐) — LightRAG is a graph-based retrieval framework designed to build retrieval-augmented generation pipelines. It structures unstructured text into knowledge graphs, enabling multi-hop reasoning and complex query synthesis across large document collections. By integrating dense vector embeddings with structured knowledge graphs, the system facilitates both similarity-based and relationship-aware information retrieval.

The framework distinguishes itself through a dual-level retrieval strategy that combines low-level keyword matching with high-level semantic graph traversal to capture both specific
- [jamiewilson/corpus](https://awesome-repositories.com/repository/jamiewilson-corpus.md) (427 ⭐) — Corpus is yet another CSS toolkit. It’s basically a collection of the things I find myself returning to for each new project. It uses Flexbox for the grid system, viewport-based heights and percentage-based widths, is heavily influenced by Basscss’s White Space module, and has a few useful…
- [agentscope-ai/agentscope](https://awesome-repositories.com/repository/agentscope-ai-agentscope.md) (26,895 ⭐) — Agentscope is a comprehensive toolkit for developing and orchestrating autonomous multi-agent systems. It provides a unified framework for building agents that can reason, execute tools, and manage memory, enabling the creation of complex, collaborative workflows where multiple specialized agents interact to solve multi-step objectives.

The platform distinguishes itself through a robust orchestration engine that supports both sequential and concurrent agent pipelines. It utilizes a centralized event bus for real-time telemetry, allowing developers to track agent reasoning, tool usage, and sys
- [khiajohnson/spice-corpus](https://awesome-repositories.com/repository/khiajohnson-spice-corpus.md) (40 ⭐) — An open-access corpus of conversational bilingual speech in Cantonese and English
- [huggingface/smolagents](https://awesome-repositories.com/repository/huggingface-smolagents.md) (27,885 ⭐) — This framework provides a development toolkit for building autonomous agents that utilize language models to solve complex, non-deterministic tasks. Its core design centers on a code-executing architecture where agents generate and run Python code snippets to perform logic, data manipulation, and tool interactions. By moving beyond structured data formats, the system enables agents to manage program flow and object state through iterative reasoning cycles.

The project distinguishes itself through its focus on code-based agent implementation and secure execution environments. Developers can ch
- [kestra-io/kestra](https://awesome-repositories.com/repository/kestra-io-kestra.md) (27,073 ⭐) — Kestra is a declarative workflow orchestrator designed to manage complex task dependencies and automated processes through versioned configuration files. It functions as a distributed platform that decouples task scheduling from execution by offloading computational workloads to a fleet of worker nodes. The system uses a reactive, event-driven engine to initiate workflows automatically in response to external signals, webhooks, schedules, or file system changes.

The platform distinguishes itself through a modular plugin architecture that allows for the integration of custom tasks and external
- [liuhuanyong/qasystemonmedicalkg](https://awesome-repositories.com/repository/liuhuanyong-qasystemonmedicalkg.md) (7,313 ⭐) — QASystemOnMedicalKG is a medical knowledge graph question answering system designed to retrieve disease-centered information from a structured data store. It functions as both a constructor for building medical knowledge graphs and a retrieval system that extracts answers regarding symptoms, causes, and treatments.

The system employs a pipeline that converts unstructured medical web data into a graph database using dictionary-based entity segmentation. It utilizes query-based intent classification to parse natural language inputs and maps these queries to specific nodes and edges within the g
- [oye93/chinese-nlp-corpus](https://awesome-repositories.com/repository/oye93-chinese-nlp-corpus.md) (922 ⭐) — A collection of Chinese NLP corpora
- [getzep/graphiti](https://awesome-repositories.com/repository/getzep-graphiti.md) (22,936 ⭐) — Graphiti is a backend framework and memory server designed to provide artificial intelligence agents with persistent, time-aware knowledge graph storage. It functions as a memory layer that enables agents to maintain context across long-term interactions by recording and evolving structured data over time.

The system distinguishes itself through a specialized temporal graph database that tracks how entities and relationships change using validity windows. By combining semantic vector similarity, keyword matching, and graph topology traversal, the engine performs hybrid retrieval to locate rel
- [cmavro/gnn-rag](https://awesome-repositories.com/repository/cmavro-gnn-rag.md) (435 ⭐) — This is the code for GNN-RAG: Graph Neural Retrieval for Large Language Model Reasoning.
- [mhagiwara/github-typo-corpus](https://awesome-repositories.com/repository/mhagiwara-github-typo-corpus.md) (519 ⭐) — GitHub Typo Corpus: A Large-Scale Multilingual Dataset of Misspellings and Grammatical Errors
- [arangodb/arangodb](https://awesome-repositories.com/repository/arangodb-arangodb.md) (14,091 ⭐) — This project is a multi-model database system designed to store and manage information as documents, graphs, and key-value pairs within a single engine. It functions as a graph database and knowledge graph platform, providing the infrastructure to build, query, and visualize structured data models. By integrating vector search capabilities, the system serves as a vector database that supports retrieval-augmented generation for artificial intelligence applications.

The platform distinguishes itself through a unified query language that allows users to perform document lookups, graph traversals
- [langbot-app/langbot](https://awesome-repositories.com/repository/langbot-app-langbot.md) (15,311 ⭐) — LangBot is an orchestration platform designed for building, managing, and deploying AI agents. It functions as a comprehensive framework for integrating large language models with custom workflows, enabling developers to connect intelligent agents to various messaging platforms and external tools.

The platform distinguishes itself through a modular, plugin-based architecture that allows for the extension of agent capabilities via custom tools and file parsers. It features a secure, sandbox-isolated runtime environment that executes untrusted code and plugin logic within resource-constrained c
