# apache/dolphinscheduler

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/apache-dolphinscheduler).**

14,311 stars · 5,047 forks · Java · Apache-2.0

## Links

- GitHub: https://github.com/apache/dolphinscheduler
- Homepage: https://dolphinscheduler.apache.org/
- awesome-repositories: https://awesome-repositories.com/repository/apache-dolphinscheduler.md

## Topics

`airflow` `azkaban` `cloud-native` `data-pipelines` `job-scheduler` `orchestration` `powerful-data-pipelines` `task-scheduler` `workflow` `workflow-orchestration` `workflow-schedule`

## Description

DolphinScheduler is a distributed workflow orchestrator for managing and automating complex data processing pipelines. It schedules multi-step tasks across distributed environments and runs them in the order their declared dependencies require.

The platform represents workflows as directed acyclic graphs (DAGs), and users define task relationships through a visual interface. It uses a master-worker architecture with a task plugin system, so new task types can be added without modifying the core codebase.
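
The two ideas in the paragraph above, dependency-ordered execution and pluggable task types, can be illustrated with a minimal Python sketch. This is not DolphinScheduler's actual API or implementation: `TASK_PLUGINS`, `register_task_type`, `run_workflow`, and `is_acyclic` are hypothetical names chosen for illustration, and the handlers only return strings instead of running real jobs. The sketch assumes Python 3.9+ for the standard-library `graphlib` module.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Callable, Dict, Iterable, List, Mapping, Tuple

# Hypothetical registry: task-type name -> handler. New task types are added
# by registering a handler, without touching run_workflow below.
TASK_PLUGINS: Dict[str, Callable[[str], str]] = {}


def register_task_type(name: str):
    """Decorator that registers a handler under a task-type name."""
    def decorator(handler: Callable[[str], str]) -> Callable[[str], str]:
        TASK_PLUGINS[name] = handler
        return handler
    return decorator


@register_task_type("shell")
def run_shell(payload: str) -> str:
    # Placeholder: a real handler would execute the script.
    return f"shell:{payload}"


@register_task_type("sql")
def run_sql(payload: str) -> str:
    # Placeholder: a real handler would run the statement.
    return f"sql:{payload}"


def is_acyclic(upstream: Mapping[str, Iterable[str]]) -> bool:
    """Return True if the dependency mapping forms a valid DAG."""
    try:
        TopologicalSorter(upstream).prepare()
        return True
    except CycleError:
        return False


def run_workflow(
    tasks: Mapping[str, Tuple[str, str]],
    upstream: Mapping[str, Iterable[str]],
) -> List[Tuple[str, str]]:
    """Run every task after all of its upstream tasks.

    tasks:    task name -> (task type, payload)
    upstream: task name -> names of tasks that must finish first
              (every name is assumed to appear in `tasks`)
    """
    graph = {name: upstream.get(name, ()) for name in tasks}
    results = []
    # static_order() yields each task only after its predecessors.
    for name in TopologicalSorter(graph).static_order():
        task_type, payload = tasks[name]
        results.append((name, TASK_PLUGINS[task_type](payload)))
    return results
```

For a three-step chain such as `extract`, `transform`, `load`, where each step lists the previous one as upstream, `run_workflow` returns the results in that order. Supporting another task type only requires one more `@register_task_type(...)` handler, which mirrors the extension point the description refers to.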

The system provides monitoring and observability tools that track the status and performance of distributed tasks in real time. Automated scheduling and recurring task management let it coordinate large-scale data processing jobs across diverse infrastructure components.

## Tags

### Data & Databases

- [Distributed Task Schedulers](https://awesome-repositories.com/f/data-databases/distributed-task-schedulers.md) — Provides a platform for defining, scheduling, and monitoring complex data processing pipelines across distributed environments.
- [Data Pipeline Orchestration](https://awesome-repositories.com/f/data-databases/data-pipeline-orchestration.md) — Manages complex multi-step data processing tasks across distributed systems via a visual interface.
- [Workflow Orchestrators](https://awesome-repositories.com/f/data-databases/workflow-orchestrators.md) — Orchestrates complex, multi-step data processing workflows across distributed environments using dependency-based scheduling. ([source](https://dolphinscheduler.apache.org/))
- [Data Processing Workflows](https://awesome-repositories.com/f/data-databases/data-processing-workflows.md) — Coordinates large-scale data processing jobs across diverse infrastructure to ensure reliable data movement.
- [State Persistence](https://awesome-repositories.com/f/data-databases/state-persistence.md) — Stores workflow metadata and execution logs in a relational database to ensure system recovery and task continuity.

### Business & Productivity Software

- [Task Schedulers](https://awesome-repositories.com/f/business-productivity-software/task-schedulers.md) — Automates the execution of recurring data pipelines on fixed timetables or specific triggers. ([source](https://dolphinscheduler.apache.org/))

### Development Tools & Productivity

- [Task Scheduling](https://awesome-repositories.com/f/development-tools-productivity/task-scheduling.md) — Automates recurring data pipeline execution to ensure consistent processing without manual intervention.

### Programming Languages & Runtimes

- [Directed Acyclic Graph Execution Engines](https://awesome-repositories.com/f/programming-languages-runtimes/runtime-execution-environments/runtime-environments/runtimes/graph-symbolic-execution-engines/directed-acyclic-graph-execution-engines.md) — Represents workflows as directed acyclic graphs to define task dependencies and ensure correct execution order.

### System Administration & Monitoring

- [Task Monitoring](https://awesome-repositories.com/f/system-administration-monitoring/task-monitoring.md) — Provides real-time observability into the execution status and performance of multi-step workflows.
- [Workflow Monitoring Systems](https://awesome-repositories.com/f/system-administration-monitoring/workflow-monitoring-systems.md) — Provides real-time observability into the status and performance of distributed tasks within complex pipelines. ([source](https://dolphinscheduler.apache.org/))

### Part of an Awesome List

- [General Purpose Orchestration](https://awesome-repositories.com/f/awesome-lists/devtools/general-purpose-orchestration.md) — Distributed, extensible workflow scheduler with visual DAG management.

### DevOps & Infrastructure

- [Worker Node Management](https://awesome-repositories.com/f/devops-infrastructure/worker-node-management.md) — Manages distributed worker nodes through a central master node for task execution across a cluster.

### Networking & Communication

- [Distributed Coordination Services](https://awesome-repositories.com/f/networking-communication/distributed-systems-p2p/distributed-systems-coordination/distributed-systems-infrastructure/distributed-coordination-services.md) — Provides consensus-based coordination for leader election, service discovery, and health monitoring in distributed clusters.

### Software Engineering & Architecture

- [Workflow Monitoring](https://awesome-repositories.com/f/software-engineering-architecture/workflow-monitoring.md) — Tracks the lifecycle and execution progress of distributed tasks to identify bottlenecks in real time.
- [Plugin-Based Architectures](https://awesome-repositories.com/f/software-engineering-architecture/software-architecture/architectural-patterns/plugin-module-systems/modular-plugin-architectures/plugin-based-architectures.md) — Implements a pluggable architecture that allows dynamic loading of task modules without modifying core code.
