What are the best open-source alternatives to DataFlow?

30 open-source projects similar to opendcai/dataflow, ranked by shared features. Top picks: camel-ai/camel, apache/nifi, vibrantlabsai/ragas, steveyegge/beads, business-science/ai-data-science-team, orchest/orchest, maiot-io/zenml, openbmb/chatdev, esbatmop/mnbvc, datajuicer/data-juicer.

Is camel-ai/camel a good alternative to DataFlow?

This project is a comprehensive framework for building and managing autonomous agent systems. It provides a unified architecture for orchestrating multi-agent societies, where specialized agents collaborate through roleplay to decompose and solve complex tasks. The system integrates language models…

Is apache/nifi a good alternative to DataFlow?

Apache NiFi is a flow-based programming platform that enables the visual design, monitoring, and management of data pipelines. At its core, it provides a web-based visual dataflow designer where users build directed graphs of processors to route, transform, and mediate data movement between any sou…

Is vibrantlabsai/ragas a good alternative to DataFlow?

Ragas is an evaluation framework designed to measure the performance of retrieval-augmented generation pipelines and autonomous agent workflows. It provides a comprehensive suite of tools for benchmarking system outputs, utilizing language models as automated judges to score performance against def…

Is steveyegge/beads a good alternative to DataFlow?

Beads is a versioned, dependency-aware graph database designed for distributed issue tracking and project management. It functions as an agentic workflow orchestrator, providing a structured environment where tasks, dependencies, and project metadata are linked through relational hierarchies. By ma…

Is business-science/ai-data-science-team a good alternative to DataFlow?

This project is a platform that orchestrates multiple AI agents to automate data science workflows, covering data loading, cleaning, feature engineering, modeling, and querying. It also functions as a natural language database query interface, converting plain English questions into SQL, and as a vi…

Is orchest/orchest a good alternative to DataFlow?

Orchest is a data pipeline orchestrator and containerized workflow manager. It provides a platform for designing, scheduling, and executing complex data processing sequences through a combination of a graphical interface and scripting. The platform distinguishes itself by using containers to manag…

Is maiot-io/zenml a good alternative to DataFlow?

ZenML is an extensible machine learning orchestration framework designed to manage the end-to-end lifecycle of data pipelines and AI agent workflows. It functions as a durable orchestrator that executes machine learning tasks as directed acyclic graphs, ensuring that every step is containerized for…

Is openbmb/chatdev a good alternative to DataFlow?

ChatDev is an automated software engineering platform that orchestrates the end-to-end development lifecycle through a multi-agent framework. It functions as a programmable engine that coordinates specialized autonomous agents to handle design, coding, testing, and documentation tasks by transition…

Is esbatmop/mnbvc a good alternative to DataFlow?

MNBVC is a dataset pipeline and toolkit designed for the collection, cleaning, and normalization of massive text and code corpora used to train large language models. It provides specialized tools for harvesting source code, commit histories, and repository metadata from version control platforms,…

Is datajuicer/data-juicer a good alternative to DataFlow?

Data-Juicer is an open-source framework for cleaning, filtering, deduplicating, and transforming multimodal datasets to prepare them for training large language and vision models. It functions as a distributed data pipeline engine that runs processing jobs across Ray clusters, handling billions of…


Open-source alternatives to DataFlow

30 open-source projects similar to opendcai/dataflow, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best DataFlow alternative.

camel-ai/camel
17,253 stars
This project is a comprehensive framework for building and managing autonomous agent systems. It provides a unified architecture for orchestrating multi-agent societies, where specialized agents collaborate through roleplay to decompose and solve complex tasks. The system integrates language models with external environments, enabling agents to perform real-world actions through a standardized tool-calling abstraction layer. The framework distinguishes itself through its focus on iterative reasoning and data reliability. It employs automated feedback loops to refine agent outputs and self-eva…
Python · agent · ai-societies · artificial-intelligence

apache/nifi
5,976 stars
Apache NiFi is a flow-based programming platform that enables the visual design, monitoring, and management of data pipelines. At its core, it provides a web-based visual dataflow designer where users build directed graphs of processors to route, transform, and mediate data movement between any source and destination without writing custom code. The system records fine-grained data provenance for every data item from ingestion to delivery, supporting audit, debugging, and replay of data lineage. The platform distinguishes itself through a zero-master cluster architecture that distributes proc…
Java · apache · hacktoberfest · java

vibrantlabsai/ragas
12,659 stars
Ragas is an evaluation framework designed to measure the performance of retrieval-augmented generation pipelines and autonomous agent workflows. It provides a comprehensive suite of tools for benchmarking system outputs, utilizing language models as automated judges to score performance against defined rubrics and reference data. By standardizing inputs, retrieved contexts, and generated responses into a unified schema, the project enables consistent analysis across complex AI applications. The framework distinguishes itself through its ability to generate synthetic test datasets from existin…
Python · evaluation · llm · llmops

Open-source alternatives to DataFlow

camel-ai/camel

apache/nifi

vibrantlabsai/ragas

steveyegge/beads

business-science/ai-data-science-team

orchest/orchest

maiot-io/zenml

OpenBMB/ChatDev

esbatmop/MNBVC

datajuicer/data-juicer

OpenBMB/UltraRAG

dbt-labs/dbt-core

microsoft/vscode-copilot-chat

DLLXW/baby-llama2-chinese

SylphAI-Inc/AdalFlow

Kilo-Org/kilocode

dusty-nv/jetson-inference

Cinnamon/kotaemon

meta-llama/synthetic-data-kit

pageman/sutskever-30-implementations

langroid/langroid

ConardLi/easy-dataset

IBM/mcp-context-forge

chiphuyen/aie-book

thunlp/UltraChat

PlexPt/chatgpt-corpus

apache/incubator-airflow

matz/streem

coleam00/Archon

NVlabs/ffhq-dataset