awesome-repositories.com
© 2026 Bringes Technology SRL·VAT RO45896025·hello@bringes.io
Distributed Processing Engines · Awesome GitHub Repositories

1 repo


Systems designed to distribute and manage large-scale analytical queries and batch transformation jobs across computing clusters.

Distinguishing note: Focuses on the submission and management of remote batch jobs rather than local data processing.

Explore 1 awesome GitHub repository in Data & Databases · Distributed Processing Engines.

Awesome Distributed Processing Engines GitHub Repositories

  • apache/airflow

    44,326 · View on GitHub

    Airflow is a platform for programmatically authoring, scheduling, and monitoring complex data pipelines. It functions as a workflow automation engine that manages the lifecycle of recurring business processes by executing code-defined task dependencies. By representing workflows as directed acyclic graphs, the system ensures that task execution order and data flow are explicitly defined and reliably maintained across distributed computing environments. The platform distinguishes itself through a highly modular, provider-based architecture that decouples core orchestration logic from external

    Submit and manage analytical queries and batch transformation jobs on remote clusters to handle large-scale data workloads efficiently and reliably.

    Python · airflow · apache · apache-airflow
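
The description above says that representing a workflow as a directed acyclic graph makes task execution order explicit. As a rough illustration of that idea only, the sketch below uses Python's standard-library `graphlib` (Python 3.9+) rather than Airflow's own API. The task names (`extract`, `clean`, `validate`, `load`) and the helper `execution_order` are hypothetical and chosen just for this example.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "clean": {"extract"},
    "validate": {"extract"},
    "load": {"clean", "validate"},
}


def execution_order(deps):
    """Return one valid run order in which every task follows its dependencies.

    Raises graphlib.CycleError if the dependencies contain a cycle,
    i.e. if the graph is not actually acyclic.
    """
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)

# A cyclic definition is rejected instead of being scheduled.
try:
    execution_order({"a": {"b"}, "b": {"a"}})
    cycle_rejected = False
except CycleError:
    cycle_rejected = True
print("cycle rejected:", cycle_rejected)
```

Here `clean` and `validate` depend only on `extract`, so either may run first; the only guarantee is that `extract` precedes both and `load` comes last. A real orchestrator adds scheduling, retries, and remote execution on top of this ordering step, none of which is modeled here.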