2 repositories
Using SQL statements to define and manage change data capture sources.
Distinct from SQL Statement Executions: this category covers using SQL to configure CDC, not general SQL query execution or ML integration.
This project is a streaming data integration framework that captures real-time database changes and synchronizes them with downstream systems. It operates as a distributed streaming ETL and database synchronizer, reading database logs and snapshots to propagate row-level modifications to target sinks. The system supports declarative data integration, allowing users to define source-to-sink data flows using SQL or YAML configurations. It distinguishes itself by automating schema evolution to maintain synchronization when source structures change and ensuring exactly-once delivery and processin
Defines change data capture sources using SQL statements to query and process database changes.
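Chunjun is a distributed data integration framework and SQL-based ETL pipeline designed to synchronize data between heterogeneous data sources. As a change data capture (CDC) tool and heterogeneous data synchronizer, it uses a distributed processing environment to migrate and transform data across different database types. The system is distinguished by its plugin-based connector architecture, which allows custom source and sink plugins to be developed to extend connectivity to data systems that are not natively supported. It supports real-time change data capture from relational database logs and implements schema evolution propagation, automatically applying structural changes from source tables to target tables. The framework provides incremental data synchronization and cross-source data computation using SQL logic. Reliability is managed through a checkpoint-based task recovery mechanism that resumes interrupted transfers, and dirty data is handled with a dead-letter queue so that malformed records can be audited. Integration jobs can be deployed on standalone clusters, Yarn, or Kubernetes, and containerized deployment via Docker is supported.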
Enables the definition of data integration and CDC workflows using SQL scripts compatible with streaming syntax.
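The checkpoint-based recovery and dead-letter handling mentioned in the description above can be illustrated with a small, framework-independent sketch. This is not Chunjun's API: `SyncState`, `apply_event`, `run`, and the event fields `op`, `key`, and `row` are all hypothetical names chosen for illustration, and the "sink" is just an in-memory dictionary.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class SyncState:
    """Hypothetical in-memory stand-in for a sink table plus job state."""
    target: Dict[Any, dict] = field(default_factory=dict)  # primary key -> row
    dead_letters: List[Tuple[int, dict, str]] = field(default_factory=list)
    checkpoint: int = -1  # offset of the last change event that was handled


def apply_event(state: SyncState, event: dict) -> None:
    """Apply one row-level change event to the in-memory target."""
    op = event["op"]
    key = event["key"]
    if op in ("insert", "update"):
        state.target[key] = dict(event["row"])
    elif op == "delete":
        state.target.pop(key, None)
    else:
        raise ValueError(f"unknown op: {op!r}")


def run(state: SyncState, log: List[dict]) -> None:
    """Replay the change log, skipping offsets already covered by the checkpoint."""
    for offset, event in enumerate(log):
        if offset <= state.checkpoint:
            continue  # handled before the interruption
        try:
            apply_event(state, event)
        except (KeyError, ValueError, TypeError) as exc:
            # Malformed ("dirty") record: set it aside for auditing, keep going.
            state.dead_letters.append((offset, event, repr(exc)))
        state.checkpoint = offset


log = [
    {"op": "insert", "key": 1, "row": {"name": "a"}},
    {"op": "insert", "key": 2, "row": {"name": "b"}},
    {"op": "update", "key": 1, "row": {"name": "c"}},
    {"op": "bogus", "key": 3},  # malformed event
    {"op": "delete", "key": 2},
]

state = SyncState()
run(state, log[:2])  # job is interrupted after two events
run(state, log)      # restart resumes after the checkpoint
```

A real job would persist the checkpoint and the dead-letter records durably and would commit sink writes and checkpoint updates together; the sketch only shows the control flow.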