What are the main features of dtstack/chunjun?

The main features of dtstack/chunjun are: Distributed Data Processing Frameworks, Heterogeneous Data Synchronization, Change Data Capture, Change Data Capture Tools, Checkpoints and Recovery, Distributed Cluster Execution, Incremental Data Synchronization, SQL-Based Pipeline Definitions.

What are some open-source alternatives to dtstack/chunjun?

Open-source alternatives to dtstack/chunjun include:

hazelcast/hazelcast - Hazelcast is a distributed data platform that combines an in-memory data grid with a stream processing engine to…
apache/flink-cdc - This project is a streaming data integration framework that captures real-time database changes and synchronizes them…
alibaba/DataX - DataX is a distributed data integration framework and plugin-based ETL tool designed for synchronizing large datasets…
risingwavelabs/risingwave - RisingWave is a cloud-native streaming database and real-time analytics engine that uses standard SQL to process…
dlt-hub/dlt - dlt is a Python data ingestion tool and ETL pipeline framework designed to fetch data from diverse sources and persist…
jerrylead/sparkinternals - SparkInternals is a technical reference and architecture guide detailing the internal design and implementation of the…

DTStack/chunjun

Chunjun

Chunjun is a distributed data integration framework that synchronizes data between heterogeneous sources and lets ETL pipelines be defined in SQL. It works as both a change data capture tool and a heterogeneous data synchronizer, running in a distributed processing environment to move and transform data across different database types.

The system is distinguished by its plugin-based connector architecture, which allows for the development of custom source and sink plugins to extend connectivity to unsupported data systems. It supports real-time change data capture from relational database logs and implements schema evolution propagation to automatically apply structural changes from source to destination tables.
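As a rough illustration of what a plugin-based connector architecture looks like in general, the sketch below registers source and sink factories by name and wires them together at run time. It is not Chunjun's actual API: the registries, the decorators, and the `memory` and `list` plugins are all hypothetical names chosen for this example.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, object]

# Hypothetical registries; these names are illustrative, not Chunjun's API.
SOURCES: Dict[str, Callable[..., Iterator[Row]]] = {}
SINKS: Dict[str, Callable[..., Callable[[Iterable[Row]], int]]] = {}


def register_source(name: str):
    """Decorator that makes a source factory discoverable by name."""
    def decorator(factory):
        SOURCES[name] = factory
        return factory
    return decorator


def register_sink(name: str):
    """Decorator that makes a sink factory discoverable by name."""
    def decorator(factory):
        SINKS[name] = factory
        return factory
    return decorator


@register_source("memory")
def memory_source(rows: List[Row]) -> Iterator[Row]:
    """Yield rows from an in-memory list (stand-in for a real database reader)."""
    yield from rows


@register_sink("list")
def list_sink(target: List[Row]) -> Callable[[Iterable[Row]], int]:
    """Return a writer that appends rows to `target` and reports how many it wrote."""
    def write(rows: Iterable[Row]) -> int:
        count = 0
        for row in rows:
            target.append(row)
            count += 1
        return count
    return write


def run_job(source_name: str, source_args: dict, sink_name: str, sink_args: dict) -> int:
    """Look up both plugins by name and stream rows from the source into the sink."""
    reader = SOURCES[source_name](**source_args)
    writer = SINKS[sink_name](**sink_args)
    return writer(reader)
```

Adding connectivity to a new system in this sketch means registering one more factory; the job runner itself does not change, which is the main appeal of the plugin approach.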

The framework supports incremental data synchronization and cross-source data calculation using SQL logic. Reliability is handled in two ways: checkpoint-based task recovery resumes interrupted transfers, and dirty data management routes malformed records to a dead-letter queue so they can be audited.
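The two reliability mechanisms described above can be sketched in a few lines. This is a minimal, single-process illustration under stated assumptions, not Chunjun's implementation: the JSON checkpoint file, the function names, and the `checkpoint_every` parameter are all hypothetical. Because the offset is saved only periodically, a crash can cause the records written since the last checkpoint to be processed again on restart.

```python
import json
import os
from typing import Callable, List, Tuple


def load_checkpoint(path: str) -> int:
    """Return the last saved offset, or 0 if no checkpoint exists yet."""
    if not os.path.exists(path):
        return 0
    with open(path, "r", encoding="utf-8") as fh:
        return json.load(fh)["offset"]


def save_checkpoint(path: str, offset: int) -> None:
    """Write the offset to a temp file, then rename it over the old checkpoint."""
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as fh:
        json.dump({"offset": offset}, fh)
    os.replace(tmp, path)


def sync(
    records: List[str],
    transform: Callable[[str], object],
    sink: List[object],
    dead_letters: List[Tuple[int, str, str]],
    checkpoint_path: str,
    checkpoint_every: int = 2,
) -> int:
    """Process records from the last checkpoint onward; return how many were read."""
    start = load_checkpoint(checkpoint_path)
    for index in range(start, len(records)):
        raw = records[index]
        try:
            sink.append(transform(raw))
        except (ValueError, KeyError) as exc:
            # Malformed record: keep it for auditing instead of failing the job.
            dead_letters.append((index, raw, str(exc)))
        if (index + 1) % checkpoint_every == 0:
            save_checkpoint(checkpoint_path, index + 1)
    save_checkpoint(checkpoint_path, len(records))
    return len(records) - start
```

Calling `sync(["1", "x", "3"], int, sink, dead_letters, path)` writes the two parsable values to the sink and records the malformed `"x"` with its position and error message; a second call with the same checkpoint file reads nothing.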

Integration tasks can be deployed on standalone clusters, YARN, or Kubernetes, with support for containerized deployment via Docker.

Features

Distributed Data Processing Frameworks - Provides a distributed framework for synchronizing and transforming data between heterogeneous sources using a plugin-based architecture.
Heterogeneous Data Synchronization - Transfers and aligns data between heterogeneous data sources using a distributed integration framework.
Change Data Capture - Streams real-time updates from relational database logs to enable low-latency synchronization between heterogeneous systems.
Change Data Capture Tools - Collects data from relational databases in real-time via logs to facilitate low-latency synchronization.
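To make the change data capture entries above more concrete, here is a small sketch of the consuming side: applying a stream of row-level change events to a replica table. The event shape (`op`, `key`, `after`) is an assumption made for this example and is not the format Chunjun or any particular database log uses.

```python
from typing import Dict, List

Row = Dict[str, object]


def apply_change(table: Dict[int, Row], event: Dict[str, object]) -> None:
    """Apply one change event to `table`, which is keyed by primary key."""
    op = event["op"]
    key = event["key"]
    if op in ("insert", "update"):
        # Treating both as an upsert keeps replaying the same event idempotent.
        table[key] = dict(event["after"])
    elif op == "delete":
        table.pop(key, None)
    else:
        raise ValueError(f"unknown operation: {op!r}")


def replay(events: List[Dict[str, object]]) -> Dict[int, Row]:
    """Build a replica by applying events in log order."""
    table: Dict[int, Row] = {}
    for event in events:
        apply_change(table, event)
    return table
```

Order matters here: the replica matches the source only if events are applied in the order they appear in the log.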

Open-source alternatives to Chunjun

Similar open-source projects, ranked by how many features they share with Chunjun.

hazelcast/hazelcast
6,570 GitHub stars
Hazelcast is a distributed data platform that combines an in-memory data grid with a stream processing engine to support real-time analytics and event-driven applications. It functions as a partitioned, distributed key-value store that replicates data across cluster nodes to provide low-latency access and high availability. The platform also serves as a distributed SQL query engine, allowing users to execute standard SQL statements against both in-memory datasets and external data sources. What distinguishes Hazelcast is its use of a distributed consensus subsystem to maintain strongly consis
Java, big-data, caching, data-in-motion
apache/flink-cdc
6,430 GitHub stars
This project is a streaming data integration framework that captures real-time database changes and synchronizes them with downstream systems. It operates as a distributed streaming ETL and database synchronizer, reading database logs and snapshots to propagate row-level modifications to target sinks. The system supports declarative data integration, allowing users to define source-to-sink data flows using SQL or YAML configurations. It distinguishes itself by automating schema evolution to maintain synchronization when source structures change and ensuring exactly-once delivery and processin
Java, batch, cdc, change-data-capture
alibaba/DataX
17,241 GitHub stars
DataX is a distributed data integration framework and plugin-based ETL tool designed for synchronizing large datasets between heterogeneous sources and destinations. It functions as a JDBC data migration engine and offline synchronization tool, enabling the movement of data between relational databases, NoSQL stores, and object storage. The system utilizes a plugin-based connector architecture that decouples reader and writer logic, allowing it to map and transform data types across different storage engines using a standardized internal representation. This design supports heterogeneous data
Java
risingwavelabs/risingwave
9,093 GitHub stars
RisingWave is a cloud-native streaming database and real-time analytics engine that uses standard SQL to process continuous data streams. It functions as a streaming data lakehouse, combining the capabilities of a streaming SQL database with a platform that integrates streaming ingestion with open table formats. The system is distinguished by its use of the PostgreSQL wire protocol, allowing it to integrate with existing SQL tools and drivers. It employs a decoupled compute and storage architecture, persisting streaming state and materialized views in cloud object storage to enable independen
Rust, apache-iceberg, data-engineering, database

See all 30 alternatives to Chunjun

Frequently asked questions

What does dtstack/chunjun do?
