analysiscenter/batchflow

Batchflow

Features

Data Pipelines - Tools for working with random or sequential data batches.
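To make "random or sequential data batches" concrete, here is a minimal, generic sketch of the idea in plain Python. It is not Batchflow's API: the function `iter_batches` and its parameters are hypothetical names chosen for illustration only.

```python
import random
from typing import Iterator, List, Optional, Sequence, TypeVar

T = TypeVar("T")


def iter_batches(
    items: Sequence[T],
    batch_size: int,
    shuffle: bool = False,
    seed: Optional[int] = None,
) -> Iterator[List[T]]:
    """Yield consecutive batches of `items` (hypothetical helper, not Batchflow code).

    With shuffle=False the batches follow the original order (sequential).
    With shuffle=True the items are visited in a random order (random batches).
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    indices = list(range(len(items)))
    if shuffle:
        # A private Random instance keeps the global RNG state untouched.
        random.Random(seed).shuffle(indices)
    for start in range(0, len(indices), batch_size):
        yield [items[i] for i in indices[start:start + batch_size]]


data = list(range(10))
sequential = list(iter_batches(data, 4))
shuffled = list(iter_batches(data, 4, shuffle=True, seed=0))
```

In both modes every item appears exactly once per pass and the last batch may be shorter than `batch_size`; only the visiting order differs.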

Open-source alternatives to Batchflow

Similar open-source projects, ranked by how many features they share with Batchflow.

akfamily/akshare
16,358 GitHub stars
This project is a Python library designed for the programmatic retrieval and analysis of diverse financial datasets. It functions as a comprehensive toolkit for quantitative research, providing a unified interface to fetch historical and real-time market data across asset classes including equities, futures, bonds, cryptocurrencies, and foreign exchange. By abstracting complex network requests into simple, parameter-driven functions, it enables users to integrate financial data into research workflows and automated trading systems. The library distinguishes itself through its scraper-based ag…
Language: Python. Topics: academic, akshare, asset-pricing
apache/airflow
45,902 GitHub stars
Airflow is a platform for programmatically authoring, scheduling, and monitoring complex data pipelines. It functions as a workflow automation engine that manages the lifecycle of recurring business processes by executing code-defined task dependencies. By representing workflows as directed acyclic graphs, the system ensures that task execution order and data flow are explicitly defined and reliably maintained across distributed computing environments. The platform distinguishes itself through a highly modular, provider-based architecture that decouples core orchestration logic from external…
Language: Python. Topics: airflow, apache, apache-airflow
apache/beam
8,612 GitHub stars
Apache Beam is a distributed data pipeline framework and unified data processing model designed to handle both bounded batch data and unbounded real-time streams. It provides a system for building scalable, data-parallel workflows that operate across compute clusters using a single programming model. The framework utilizes a cross-runner pipeline abstraction that decouples the data processing logic from the underlying execution backend, allowing the same pipeline to run on different distributed compute engines. It supports multi-language pipeline development by translating high-level code fro…
Language: Java
activeloopai/deeplake
9,175 GitHub stars
DeepLake is AI data infrastructure consisting of a multimodal data lake, a hybrid search engine, and a serverless vector database. It provides a PostgreSQL-based AI data runtime that combines multimodal storage with streaming pipelines to load and shuffle datasets from cloud storage directly into deep learning training pipelines. The system utilizes lazy indexing to store and slice images, audio, and video without loading entire files into memory. It enables retrieval-augmented generation by persisting high-dimensional embeddings in a serverless vector store and implementing hybrid search tha…
Language: C++. Topics: agent, agentic-rag, ai

See all 30 alternatives to Batchflow

Frequently asked questions

What are the main features of analysiscenter/batchflow?

The main feature of analysiscenter/batchflow is Data Pipelines: tools for working with random or sequential data batches.

What are some open-source alternatives to analysiscenter/batchflow?

Open-source alternatives to analysiscenter/batchflow include:
akfamily/akshare: This project is a Python library designed for the programmatic retrieval and analysis of diverse financial datasets.…
apache/airflow: Airflow is a platform for programmatically authoring, scheduling, and monitoring complex data pipelines. It functions…
apache/beam: Apache Beam is a distributed data pipeline framework and unified data processing model designed to handle both bounded…
apache/flume: Apache Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving…
apache/incubator-pulsar: Apache Pulsar is a cloud-native message queue and distributed publish-subscribe messaging system. It serves as a…
activeloopai/deeplake: DeepLake is AI data infrastructure consisting of a multimodal data lake, a hybrid search engine, and a serverless…
