38 repositories
Databases designed to store and query data as nodes and edges.
Distinguishing note: Focuses on graph-based relationship management and traversal.
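As a rough illustration of what "nodes and edges" and "traversal" mean here, the following minimal sketch stores a tiny directed graph in memory and walks it breadth-first. The data, the edge labels, and the function names are all hypothetical and are not taken from any repository listed below.

```python
from collections import deque

# Hypothetical toy data: each edge is a (source, relationship, target) triple.
EDGES = [
    ("alice", "KNOWS", "bob"),
    ("bob", "KNOWS", "carol"),
    ("carol", "WORKS_AT", "acme"),
    ("dave", "KNOWS", "alice"),
]


def build_adjacency(edges):
    """Index directed edges by source node so neighbors can be looked up quickly."""
    adjacency = {}
    for source, _label, target in edges:
        adjacency.setdefault(source, []).append(target)
        adjacency.setdefault(target, [])  # make sure leaf nodes are present too
    return adjacency


def traverse(adjacency, start):
    """Breadth-first traversal: return every reachable node in visit order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in adjacency.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return order


ADJACENCY = build_adjacency(EDGES)
print(traverse(ADJACENCY, "alice"))  # ['alice', 'bob', 'carol', 'acme']
```

Because the edges are directed, "dave" is not reachable from "alice" even though "alice" is reachable from "dave".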
Embedchain is an LLM memory management framework and RAG orchestration engine designed to provide AI agents with a persistent storage layer. It functions as a long-term memory pipeline that extracts facts from unstructured interactions and stores them as permanent knowledge base entries to retain user preferences and interaction history across sessions. The system employs a hybrid vector database interface that combines semantic embeddings with traditional keyword search. It utilizes an entity-linking knowledge graph to connect related information points and applies temporal ranking to distin
Utilizes temporal tracking to rank retrieved data and distinguish current states from past events.
This project is a comprehensive reference collection of practical implementation examples and patterns for building applications with Spring Boot. It serves as a Java web application template and a showcase for developing functional web services featuring REST endpoints, template engines, and global exception handling. The repository distinguishes itself by providing detailed demonstrations of enterprise-grade features, including distributed locking, task scheduling, and asynchronous message exchange using brokers like RabbitMQ. It also includes reference implementations for automated API doc
Provides integration to store and query complex relationship networks using graph databases.
SurrealDB is a multi-model database engine designed to store and query document, graph, relational, and vector data within a single ACID-compliant platform. It functions as an AI-native data store, integrating vector search, graph traversal, and machine learning model execution directly into its query layer. By providing a unified declarative query language, the platform eliminates the need for external middleware to synchronize data across different storage models. The platform distinguishes itself through its ability to manage agent memory and complex workflows natively. It allows developer
Handles continuously updated graph data and multi-model workloads at scale.
This project is a software engineering educational resource providing a collection of canonical system implementations. It serves as a library of computer science case studies and polyglot code examples designed to demonstrate architectural tradeoffs and design patterns through concise versions of fundamental software components. The repository focuses on studying the implementation of core concepts such as consensus algorithms, interpreters, and database engines. It provides minimal versions of complex systems to facilitate the analysis of language design, data structure implementation, and
Implements a minimal in-memory graph database for educational study of relationship-based data structures.
Graphiti is a backend framework and memory server designed to provide artificial intelligence agents with persistent, time-aware knowledge graph storage. It functions as a memory layer that enables agents to maintain context across long-term interactions by recording and evolving structured data over time. The system distinguishes itself through a specialized temporal graph database that tracks how entities and relationships change using validity windows. By combining semantic vector similarity, keyword matching, and graph topology traversal, the engine performs hybrid retrieval to locate rel
Ships a specialized storage engine that manages structured information with validity windows to support real-time updates and historical retrieval.
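One way to picture validity windows is an edge that carries the interval during which it holds. The sketch below is an assumption-laden illustration, not Graphiti's actual schema or API: the `TemporalEdge` class, its field names, and the half-open interval convention are all invented for this example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class TemporalEdge:
    """Hypothetical edge that is only true within [valid_from, valid_to)."""
    source: str
    relation: str
    target: str
    valid_from: date
    valid_to: Optional[date] = None  # None means the fact is still current

    def is_valid_at(self, when: date) -> bool:
        if when < self.valid_from:
            return False
        return self.valid_to is None or when < self.valid_to


def edges_at(edges, when):
    """Return the edges whose validity window contains the given date."""
    return [edge for edge in edges if edge.is_valid_at(when)]


# Hypothetical facts: the older edge is closed rather than deleted,
# so historical queries still see it.
FACTS = [
    TemporalEdge("alice", "WORKS_AT", "acme", date(2020, 1, 1), date(2023, 6, 1)),
    TemporalEdge("alice", "WORKS_AT", "globex", date(2023, 6, 1)),
]

current = edges_at(FACTS, date(2024, 1, 1))
past = edges_at(FACTS, date(2021, 1, 1))
print([edge.target for edge in current])  # ['globex']
print([edge.target for edge in past])     # ['acme']
```

Closing an edge instead of removing it is what lets the same store answer both "what is true now" and "what was true then".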
Dgraph is a distributed graph database designed to store and query highly connected data. It organizes information as nodes and edges to represent complex relationships between entities, providing a platform for managing and analyzing deeply linked datasets. The system functions as a horizontally scalable cluster that partitions data across multiple nodes to maintain performance and availability as information volume increases. It utilizes a specialized query language built for low-latency navigation of interconnected data points, allowing for the execution of complex queries across large-sca
Functions as a horizontally scalable distributed graph database for storing and querying complex relationships.
Pentagi is an autonomous security testing framework and agent orchestrator designed to plan and execute end-to-end security assessments. It utilizes a coordination engine to decompose complex goals into actionable subtasks, performing automated penetration testing and vulnerability research within isolated container environments. The system distinguishes itself through a temporal knowledge graph that tracks semantic relationships between entities and vulnerabilities to reuse intelligence across projects. It includes a web intelligence reconnaissance tool for automated data gathering and agent
Implements a temporal graph database to track semantic relationships between entities and vulnerabilities over time.
DataX is a distributed data integration framework and plugin-based ETL tool designed for synchronizing large datasets between heterogeneous sources and destinations. It functions as a JDBC data migration engine and offline synchronization tool, enabling the movement of data between relational databases, NoSQL stores, and object storage. The system utilizes a plugin-based connector architecture that decouples reader and writer logic, allowing it to map and transform data types across different storage engines using a standardized internal representation. This design supports heterogeneous data
Exports data into graph databases by converting source records into vertices and edges.
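The conversion of source records into vertices and edges can be sketched roughly as follows. This is not DataX's plugin interface: the function `records_to_graph`, its parameters, and the sample rows are hypothetical and assume each record has one identifier column and one optional reference column.

```python
def records_to_graph(records, id_field, ref_field, edge_label):
    """Turn flat records into a vertex map and a list of labeled edges.

    Each record becomes one vertex keyed by id_field; a non-null ref_field
    becomes an edge pointing at the referenced vertex.
    """
    vertices = {}
    edges = []
    for record in records:
        vertex_id = record[id_field]
        # Every other column is kept as a vertex property.
        vertices[vertex_id] = {
            key: value
            for key, value in record.items()
            if key not in (id_field, ref_field)
        }
        target = record.get(ref_field)
        if target is not None:
            edges.append((vertex_id, edge_label, target))
    return vertices, edges


# Hypothetical rows, as they might come out of a relational table.
ROWS = [
    {"id": 1, "name": "Ada", "manager_id": None},
    {"id": 2, "name": "Grace", "manager_id": 1},
    {"id": 3, "name": "Linus", "manager_id": 1},
]

VERTICES, REPORT_EDGES = records_to_graph(ROWS, "id", "manager_id", "REPORTS_TO")
print(VERTICES)      # {1: {'name': 'Ada'}, 2: {'name': 'Grace'}, 3: {'name': 'Linus'}}
print(REPORT_EDGES)  # [(2, 'REPORTS_TO', 1), (3, 'REPORTS_TO', 1)]
```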
This project is a comprehensive framework for building and managing autonomous agent systems. It provides a unified architecture for orchestrating multi-agent societies, where specialized agents collaborate through roleplay to decompose and solve complex tasks. The system integrates language models with external environments, enabling agents to perform real-world actions through a standardized tool-calling abstraction layer. The framework distinguishes itself through its focus on iterative reasoning and data reliability. It employs automated feedback loops to refine agent outputs and self-eva
Provides graph database backends to support knowledge-based reasoning and analytics.
Beads is a versioned, dependency-aware graph database designed for distributed issue tracking and project management. It functions as an agentic workflow orchestrator, providing a structured environment where tasks, dependencies, and project metadata are linked through relational hierarchies. By maintaining a persistent, version-controlled record of project state, the system enables teams to manage complex work items across multiple repositories and environments. The platform distinguishes itself through its deep integration with automated coding agents, acting as a Model Context Protocol ser
Acts as a versioned, dependency-aware graph database for tracking project tasks and complex workflows.
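A dependency-aware task graph makes it cheap to ask which work items are unblocked. The sketch below is only an illustration of that idea and does not reflect Beads's data model or commands; the task names and the `ready_tasks` helper are hypothetical.

```python
def ready_tasks(dependencies, done):
    """Return tasks that are not finished and whose dependencies are all done."""
    return sorted(
        task
        for task, deps in dependencies.items()
        if task not in done and all(dep in done for dep in deps)
    )


# Hypothetical project: each task maps to the set of tasks it depends on.
DEPENDENCIES = {
    "design-schema": set(),
    "write-migrations": {"design-schema"},
    "build-api": {"design-schema"},
    "deploy": {"write-migrations", "build-api"},
}

print(ready_tasks(DEPENDENCIES, set()))              # ['design-schema']
print(ready_tasks(DEPENDENCIES, {"design-schema"}))  # ['build-api', 'write-migrations']
```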
Sourcetrail is an interactive source code explorer and visualizer designed for indexing and navigating relationships between symbols and structures across large, multi-language codebases. It functions as a static analysis indexer and code dependency visualizer that maps calls and dependencies between source files to help reveal project architecture. The tool enables multi-language project analysis by using a language-agnostic indexing system to track symbols across different programming languages within a single interface. It allows for the discovery of software architecture and the explorati
Stores parsed symbols and their connections in a relational database to enable complex architectural queries.
Cayley is a graph database engine designed for storing and querying interconnected data using a quad-based data model. It functions as an RDF quad store, managing information through subjects, predicates, objects, and labels. The system features a modular graph store architecture with pluggable backends, allowing it to swap between in-memory storage and various external persistent databases. It includes a GraphQL-inspired API and a dedicated data visualizer for the interactive exploration of nodes and edges. Query capabilities cover bidirectional path traversal and multi-syntax execution usi
Provides a complete graph database engine for storing and querying interconnected data using nodes and edges.
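To make the quad-based model concrete, here is a minimal in-memory sketch in which every fact is a four-element tuple of subject, predicate, object, and label. It is written in Python for illustration only and is not Cayley's API; the `Quad` and `QuadStore` names and the `match` method are assumptions.

```python
from typing import NamedTuple


class Quad(NamedTuple):
    """One fact: subject --predicate--> obj, tagged with a label."""
    subject: str
    predicate: str
    obj: str
    label: str


class QuadStore:
    """Hypothetical list-backed quad store with wildcard pattern matching."""

    def __init__(self):
        self._quads = []

    def add(self, subject, predicate, obj, label=""):
        self._quads.append(Quad(subject, predicate, obj, label))

    def match(self, subject=None, predicate=None, obj=None, label=None):
        """Return quads matching the pattern; None acts as a wildcard."""
        pattern = (subject, predicate, obj, label)
        return [
            quad
            for quad in self._quads
            if all(want is None or want == got for want, got in zip(pattern, quad))
        ]


store = QuadStore()
store.add("alice", "follows", "bob", "social")
store.add("bob", "follows", "carol", "social")
store.add("alice", "works_at", "acme", "employment")

print([quad.obj for quad in store.match(predicate="follows")])  # ['bob', 'carol']
print(store.match(subject="alice", label="employment"))
```

The label lets the same subject, predicate, and object be grouped into separate subgraphs, which a plain triple cannot express.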
Cayley is a graph database and query engine designed to store and retrieve interconnected data. It functions as a quad store, persisting information as four-element tuples to maintain complex relationships and semantic linked data. The system features a backend-agnostic storage layer that decouples the graph API from the underlying data store. This allows for the integration of external backends through a modular adapter system, enabling the synchronization of data across different storage engines. The project provides a pattern-matching query engine for extracting specific nodes and relatio
Functions as a complete graph database system for storing and querying highly connected data.
Amass is a network attack surface mapper and reconnaissance framework designed to discover and map the external, internet-facing infrastructure of a target organization. It functions as an open source intelligence tool that identifies public network boundaries and locates hidden or forgotten subdomains to define an organization's total reachable footprint. The project utilizes passive-source data aggregation from external APIs and public databases alongside active DNS brute-forcing and recursive subdomain expansion. It employs a graph-based asset mapping system to visualize the relationships
Utilizes a graph database to map and visualize the relationships between discovered domains and IP addresses.
Amass is an attack surface management tool designed to identify, map, and inventory an organization's internet-facing digital assets. It functions as a security asset discovery engine that systematically expands an organization's known infrastructure footprint through recursive domain name resolution and the collection of intelligence from diverse public data sources. The platform distinguishes itself by utilizing a graph-based modeling approach to organize discovered resources. By maintaining a persistent graph database, it tracks the relationships between infrastructure components and norma
Maintains a persistent graph database to track historical states and relationships of discovered assets.
Planning with files is an enterprise knowledge graph platform designed to transform unstructured organizational data into a searchable, interconnected network. By utilizing a graph-based retrieval-augmented generation engine, the system grounds language model outputs in verified internal data, ensuring that responses are explainable, traceable, and free from hallucinations. The platform distinguishes itself through a focus on data sovereignty and secure, private infrastructure deployment. It enables organizations to maintain full control over sensitive information by processing data locally o
Utilizes graph databases to provide context and relationships for models, reducing hallucinations and ensuring verifiable intelligence.
This project is a multi-model database system designed to store and manage information as documents, graphs, and key-value pairs within a single engine. It functions as a graph database and knowledge graph platform, providing the infrastructure to build, query, and visualize structured data models. By integrating vector search capabilities, the system serves as a vector database that supports retrieval-augmented generation for artificial intelligence applications. The platform distinguishes itself through a unified query language that allows users to perform document lookups, graph traversals
Stores and traverses complex relationships between data points using native graph structures and algorithms.
Unstructured is an enterprise-grade data orchestration engine designed to transform raw, unstructured files into structured, machine-readable formats. It functions as a comprehensive platform for document ingestion, partitioning, and enrichment, specifically engineered to prepare complex data for retrieval-augmented generation and agentic AI workflows. The platform distinguishes itself through its sophisticated document processing strategies, which combine rule-based extraction with vision-language models to handle diverse file layouts, tables, and images. It provides a modular architecture t
Writes processed document elements and their chunking relationships into graph databases to enable complex relationship queries.
Nebula is a distributed graph database designed for storing and querying massive volumes of interconnected vertices and edges across a horizontally scalable cluster. It functions as a Kubernetes-native database and a distributed graph analytics engine, utilizing a Raft-based distributed store to ensure strong consistency and high availability. The system features an OpenCypher query engine for performing complex graph traversals and pattern matching. It distinguishes itself with a decoupled compute-storage architecture and a shared-nothing distributed design, allowing query processing and dat
Provides a distributed graph database designed to store and query massive volumes of interconnected data.
Bloodhound is an Active Directory attack path mapper and security auditor designed to visualize trust relationships and permission chains. It serves as an attack surface management tool that identifies paths to domain administrator and other high-privileged accounts. The project uses a graph database analyzer to map complex identity and access relationships. It quantifies the risk of privilege escalation by identifying misconfigured permissions and trust links within Windows domains. The system provides capabilities for Active Directory security analysis, identity and access auditing, and ne
Leverages a graph database to store and analyze highly connected identity and access relationships.
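Finding a path to a high-privileged account is, at its simplest, a shortest-path search over a permission graph. The sketch below shows that idea with a breadth-first search. It is not BloodHound's data model or query language; the node names, the `PERMISSIONS` mapping, and the `shortest_path` function are hypothetical.

```python
from collections import deque


def shortest_path(adjacency, start, goal):
    """Breadth-first search returning the fewest-hop path, or None if unreachable."""
    previous = {start: None}  # maps each discovered node to the node it came from
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in adjacency.get(node, []):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


# Hypothetical permission graph: an edge means "can gain control of".
PERMISSIONS = {
    "user:intern": ["group:helpdesk"],
    "group:helpdesk": ["host:fileserver", "user:svc-backup"],
    "host:fileserver": ["user:svc-backup"],
    "user:svc-backup": ["group:domain-admins"],
}

print(shortest_path(PERMISSIONS, "user:intern", "group:domain-admins"))
# ['user:intern', 'group:helpdesk', 'user:svc-backup', 'group:domain-admins']
```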