59 Repos
Frameworks for constructing and managing directed acyclic graphs for task execution.
Distinguishing note: Focuses on graph construction for accelerated distributed execution.
Explore 59 awesome GitHub repositories matching software engineering & architecture · Execution Graphs. Refine with filters or upvote what's useful.
ComfyUI is a modular generative AI workflow orchestrator and node-based GUI for designing and executing complex diffusion model pipelines. It functions as both a visual interface for building generative logic graphs and a programmable backend API that exposes diffusion model operations for external integration. The system distinguishes itself through a graph-based execution model that supports differential workflow execution, re-running only modified nodes to reduce computation. It features dynamic model offloading to manage memory between system RAM and GPU VRAM and utilizes metadata-embedde
Uses a node-based graph to orchestrate and execute complex generative AI tasks.
Ray is a distributed computing framework designed to scale Python and Java applications across clusters by abstracting task scheduling and resource management. It functions as a resource-aware execution engine that manages task dependencies, placement, and fault tolerance across networked compute nodes. At its core, the system provides a stateful actor model, allowing developers to define classes that run in dedicated processes to maintain and mutate internal state across remote method calls. The framework distinguishes itself through a robust cross-language interoperability layer, enabling f
Supports binding actor methods and configuring transport settings to prepare complex task chains.
MediaPipe is a cross-platform machine learning framework designed for building and deploying pipelines that process live and streaming media. It provides a system for connecting processing components into custom machine learning chains to analyze real-time audio and video streams. The framework includes a suite of pre-trained models for tasks such as hand, face, and pose tracking, along with tools for retraining and customizing these models with specific datasets. It also features a dedicated benchmarker for measuring the execution speed and accuracy of machine learning models directly within
Processes streaming data by passing packets through a directed graph of modular calculators.
Gulp is a JavaScript build automation tool and Node.js task runner designed to coordinate the execution of development tasks. It operates as a streaming build system that transforms source files into production assets through a series of programmable pipelines. The system functions as a file watcher and task orchestrator, monitoring directories for modifications to automatically trigger specific build tasks. It organizes these units of work into sequential or parallel execution paths to streamline development workflows. The toolkit manages asset pipelines by reading files via pattern matchin
Implements an execution graph to organize build units into sequential or parallel paths.
This project is a machine learning array framework and tensor computation library designed for high-performance numerical computing. It provides a comprehensive suite of tools for constructing and training neural networks, featuring an automatic differentiation engine that facilitates gradient-based optimization and complex mathematical modeling. The library distinguishes itself through a unified memory architecture that allows data to be shared across CPU and GPU devices without explicit copies, significantly reducing data movement overhead. Its execution model relies on a lazy evaluation en
Triggers the execution of accumulated compute graphs at specific intervals to balance processing overhead against the benefits of batching.
Apache Flink is a distributed processing engine designed for both high-throughput, low-latency data streams and finite batch workloads. It functions as a stateful stream processor and a SQL stream processing engine, providing a unified runtime to execute relational queries and event-based transformations. The system is distinguished by its ability to manage persistent operator state to ensure exactly-once processing guarantees and consistency during failures. It features specialized capabilities for complex event processing to detect temporal patterns and handles out-of-order events using eve
Translates relational queries into optimized physical execution graphs of streaming or batch operators.
Deepagents is an LLM agent orchestration platform and stateful application server designed for deploying and managing AI agents built with computational graphs. It provides a containerized runtime environment that handles agent execution, state persistence, and the versioning of AI assistants. The platform distinguishes itself through deep integration with the Model Context Protocol, allowing agents to function as servers that expose tools and capabilities to external clients. It features a sophisticated observability suite for capturing execution traces, performing LLM-based evaluations agai
Triggers one-off agent tasks in the background that bypass state persistence for lower latency.
Graal is a compiler and runtime architecture designed for high-performance execution and polyglot interoperability. It utilizes a graph-based representation of program logic to perform global optimizations and JIT compilation. The project features a meta-circular interpretation framework and a specialized partial evaluation mechanism, which allow for the creation of new programming languages and the automatic optimization of their semantics into machine code. It enables multiple diverse programming languages to share memory and communicate through a standardized cross-language protocol within
Implements a graph-based representation of program logic to perform global optimizations before emitting final machine code.
MXNet is a deep learning framework and distributed machine learning engine designed for training and deploying neural networks. It functions as a hardware-agnostic backend that allows for the development of deep learning models through a hybrid of symbolic and imperative programming. The system distinguishes itself through automatic distributed parallelism, which scales training workloads across multiple GPUs and machines. It features an extensible hardware backend interface that enables the integration of custom accelerators and proprietary libraries without modifying the core source code.
Schedules operations by tracking dependencies in a dynamic graph to execute nodes immediately as data dependencies are met.
This project is a deep learning framework designed for constructing, training, and deploying neural networks across diverse hardware environments. It functions as a high-performance tensor computation library that provides both imperative and symbolic programming interfaces, allowing developers to balance flexible, step-by-step model building with the efficiency of compiled computation graphs. The framework distinguishes itself through a hybrid execution engine that integrates declarative graph compilation with imperative runtime logic. It supports scalable, distributed training across multip
Integrates custom operators and hardware-specific optimizations into computation graphs to accelerate model performance.
LangChain.js is a framework for building, executing, and monitoring stateful agentic applications. It provides an orchestration engine that models workflows as directed graphs, allowing developers to connect language models, data sources, and external tools into modular, multi-step processes. The platform distinguishes itself through its focus on stateful execution and human-in-the-loop control. It manages agent lifecycles by persisting execution state across threads, enabling fault tolerance and the ability to pause workflows at designated breakpoints for manual review or modification. This
Supports stateless graph execution for low-latency, incremental streaming of model responses.
AISystem is a comprehensive AI full-stack infrastructure project covering the entire pipeline from AI chip architecture to high-level training frameworks. It encompasses the development of AI compiler frameworks, inference engines, and distributed training orchestrators designed to coordinate workloads across a heterogeneous compute stack of CPUs, GPUs, and NPUs. The project focuses on the deep integration of software and hardware, employing software-hardware co-design to align tensor layouts with physical memory structures. It provides specialized capabilities for accelerating Transformer mo
Transforms compute graphs through operator fusion and layout conversion to maximize hardware utilization.
RagaAI-Catalyst is a suite of software implementation tools providing an SDK, dashboard, and platform for monitoring, debugging, red-teaming, and evaluating agentic AI workflows. It serves as an observability framework for tracing the execution paths of large language models and multi-agent systems. The project distinguishes itself through a security suite for automated red-teaming and vulnerability scanning to detect biases, alongside a centralized prompt registry that decouples templates from application code. It further provides an evaluation platform that combines synthetic data generatio
Captures sequential tool calls and model interactions to visualize the logic flow as a directed graph.
Dagger is a programmable CI/CD engine and containerized task runner designed to orchestrate build and test pipelines. It functions as an incremental build system that manages containers, filesystems, and secrets through a typed API to ensure consistent execution across local and cloud environments. The engine utilizes a language-agnostic client-server API to allow multi-language pipeline orchestration, enabling the sharing of typed artifacts and state across different SDKs without manual serialization. It optimizes execution through content-addressable caching and a directed acyclic graph to
Uses a directed acyclic graph to model pipeline dependencies and enable parallel task execution.
Ivy is a machine learning framework transpiler and model converter designed to ensure deep learning portability. It serves as a tool for migrating source code and models between different deep learning frameworks while maintaining original functionality. The system enables cross-framework model portability by translating model weights, architectures, and source code. It uses abstract syntax tree based transpilation and computational graph tracing to capture execution flows and rewrite high-level logic into target framework code. The project covers model interoperability through weight-layout
Captures execution flows as graphs to analyze or execute operations within a target framework.
ai-edu is a comprehensive AI education curriculum and machine learning courseware collection. It provides theoretical tutorials, deep learning lab exercises, and project blueprints designed to teach artificial intelligence fundamentals through a combination of study and practical implementation. The project focuses on a learning-by-doing approach, guiding users from Python programming and neural network basics to advanced topics. It includes specialized instructional content on distributed AI training, MLOps educational guides for model quantization and pruning, and detailed frameworks for im
Provides instructional content on compiling neural network graphs to optimize GPU execution and reduce latency.
Dask ist ein Framework für paralleles Rechnen und ein verteilter Task-Scheduler, der darauf ausgelegt ist, Python-Data-Science-Workflows von einzelnen Maschinen auf große Cluster zu skalieren. Es fungiert als Cluster-Ressourcenmanager, der die Berechnungslogik orchestriert, indem Aufgaben und deren Abhängigkeiten als gerichtete azyklische Graphen dargestellt werden. Diese Architektur ermöglicht es dem System, die Verteilung von Workloads auf verfügbare Hardware zu automatisieren und gleichzeitig komplexe Ausführungsanforderungen zu verwalten. Das Projekt zeichnet sich durch eine Lazy-Evaluation-Engine aus, die Datenoperationen verzögert, bis sie explizit angefordert werden, was eine globale Graphoptimierung und effiziente Ressourcenzuweisung ermöglicht. Es integriert speicherbewusstes Data-Spilling, um Systemabstürze bei der Verarbeitung von Datensätzen zu verhindern, die den verfügbaren Speicher überschreiten, und nutzt Task-Graph-Fusion, um Sequenzen von Operationen in einzelne Ausführungsschritte zu kombinieren, wodurch Scheduling-Overhead und Inter-Node-Kommunikation minimiert werden. Die Plattform bietet eine umfassende Oberfläche für die Datenanalyse im großen Maßstab, einschließlich Unterstützung für verteiltes maschinelles Lernen, Integration in das Hochleistungsrechnen und parallele Datenverarbeitung. Sie bietet umfangreiche Werkzeuge für das Cluster-Lebenszyklusmanagement, Performance-Profiling und die Echtzeitüberwachung der Aufgabenausführung. Benutzer können diese Umgebungen über verschiedene Infrastrukturen hinweg bereitstellen, einschließlich lokaler Hardware, Cloud-Anbietern, containerisierten Systemen und Hochleistungsrechner-Clustern.
Orchestrates parallel execution of arbitrary task dependencies by defining and processing directed acyclic graphs.
Earthly is a containerized build system and Docker build framework designed for creating reproducible build pipelines. It ensures environment consistency by executing every build step inside an isolated container, combining the isolation of container images with dependency tracking and parallel execution. The system differentiates itself through a focus on hermeticity and multiplatform support, allowing for the generation of container images targeting multiple CPU architectures within a single execution flow. It maintains a hermetic build environment by isolating network access and utilizing
Models build targets as a directed acyclic graph to optimize parallel execution and result reuse.
Cpp-taskflow is a C++ task-parallelism framework and task graph scheduler designed to manage and execute complex dependency graphs of parallel tasks across CPU and GPU hardware. It provides a parallel algorithm library for high-performance implementations of reductions, sorts, pipelines, and iterations. The framework distinguishes itself through its ability to offload heavy computational workloads from a task graph to graphics processors for acceleration. It also includes a task profiling tool and a performance analysis interface for visualizing task execution flow and dependency structures t
Provides a framework for constructing and managing directed acyclic graphs to execute interdependent tasks in parallel.
Taskflow is a C++ task-parallel framework designed to build high-performance parallel workflows and complex dependency graphs. It provides a programming model that organizes computational work into directed acyclic graphs, enabling developers to manage concurrency, resource scheduling, and task dependencies across multi-core CPUs and GPU accelerators. The framework distinguishes itself through its ability to orchestrate heterogeneous systems, allowing for the integration of hardware-accelerated kernels and memory operations into unified execution pipelines. It supports dynamic runtime subflow
Clears the internal state of a captured graph execution or base graph object to prepare it for new operations.