# alibaba/otter

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/alibaba-otter).**

8,127 stars · 2,462 forks · Java · Apache-2.0

## Links

- GitHub: https://github.com/alibaba/otter
- awesome-repositories: https://awesome-repositories.com/repository/alibaba-otter.md

## Description

Otter is a distributed database synchronization system and change data capture tool that replicates data between databases across multiple geographic regions. It acts as a synchronization orchestrator and ETL data pipeline, mirroring records and associated files in near real time.

The system captures database changes by parsing incremental logs, and it manages bi-directional replication with a consistency-based convergence algorithm and loop-avoidance logic. Each change passes through a four-stage pipeline (select, extract, transform, load) that handles joins and format conversions before records reach the target tables.
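The staged pipeline and the loop-avoidance idea described above can be illustrated with a small sketch. This is not Otter's code or API: `SetlSketch`, `ChangeEvent`, the `originSite` marker, and the upper-case table naming are all hypothetical names and assumptions chosen for illustration. The sketch assumes each captured change carries the site where it was first written, and that a change is skipped when its origin equals the target.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch; none of these names come from Otter's codebase.
final class SetlSketch {

    // A captured row change. originSite is an assumed marker for the site
    // where the row was first written.
    static final class ChangeEvent {
        final String originSite;
        final String table;
        final String rowKey;
        final String value;

        ChangeEvent(String originSite, String table, String rowKey, String value) {
            this.originSite = originSite;
            this.table = table;
            this.rowKey = rowKey;
            this.value = value;
        }
    }

    // Select: skip changes that originated at the target site, so a change
    // is never echoed back to the site that produced it (loop avoidance).
    static List<ChangeEvent> select(List<ChangeEvent> captured, String targetSite) {
        List<ChangeEvent> selected = new ArrayList<>();
        for (ChangeEvent event : captured) {
            if (!event.originSite.equals(targetSite)) {
                selected.add(event);
            }
        }
        return selected;
    }

    // Extract: keep only tables configured for synchronization
    // (a simple stand-in for the join and filter work of this stage).
    static List<ChangeEvent> extract(List<ChangeEvent> events, List<String> syncedTables) {
        List<ChangeEvent> extracted = new ArrayList<>();
        for (ChangeEvent event : events) {
            if (syncedTables.contains(event.table)) {
                extracted.add(event);
            }
        }
        return extracted;
    }

    // Transform: map the table name to the target's naming convention
    // (assumed here to be upper case).
    static List<ChangeEvent> transform(List<ChangeEvent> events) {
        List<ChangeEvent> transformed = new ArrayList<>();
        for (ChangeEvent e : events) {
            transformed.add(new ChangeEvent(
                    e.originSite, e.table.toUpperCase(Locale.ROOT), e.rowKey, e.value));
        }
        return transformed;
    }

    // Load: render the statements that would be applied at the target.
    // The origin is kept so the target's own capture step can skip the row later.
    static List<String> load(List<ChangeEvent> events) {
        List<String> statements = new ArrayList<>();
        for (ChangeEvent e : events) {
            statements.add("UPSERT " + e.table + "[" + e.rowKey + "]=" + e.value
                    + " (origin=" + e.originSite + ")");
        }
        return statements;
    }

    static List<String> replicate(List<ChangeEvent> captured, String targetSite,
                                  List<String> syncedTables) {
        return load(transform(extract(select(captured, targetSite), syncedTables)));
    }

    public static void main(String[] args) {
        List<ChangeEvent> captured = Arrays.asList(
                new ChangeEvent("site-a", "orders", "1", "paid"),
                new ChangeEvent("site-b", "orders", "2", "new"),    // originated at target
                new ChangeEvent("site-a", "audit_tmp", "9", "x"));  // table not synchronized
        for (String statement : replicate(captured, "site-b", Arrays.asList("orders"))) {
            System.out.println(statement);
        }
    }
}
```

Running `main` with `site-b` as the target applies only the `orders` row written at `site-a`. The row that originated at `site-b` is dropped in the select step, which is what keeps two sites that replicate to each other from bouncing the same change back and forth.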

The platform includes a distributed coordination layer that manages worker node state and schedules large-scale synchronization tasks across remote data centers. It also provides health monitoring that tracks replication lag and throughput, and administrative access control over system configuration.

## Tags

### Data & Databases

- [Database Replication Tools](https://awesome-repositories.com/f/data-databases/database-replication-tools.md) — A distributed system designed for replicating database records across remote data centers with built-in loop avoidance.
- [Change Data Capture Tools](https://awesome-repositories.com/f/data-databases/change-data-capture-tools.md) — Captures and parses incremental database logs to mirror records and associated files in real time.
- [ETL Workflows](https://awesome-repositories.com/f/data-databases/data-pipeline-orchestration/etl-workflows.md) — Implements ETL workflows to extract data from sources and apply transformations before target delivery.
- [Data Replication](https://awesome-repositories.com/f/data-databases/data-replication.md) — Synchronizes data between databases using consistency algorithms and loop-avoidance to ensure convergence. ([source](https://github.com/alibaba/otter/wiki/Introduction))
- [Real-Time Data Replication](https://awesome-repositories.com/f/data-databases/database-replication/real-time-data-replication.md) — Parses incremental database logs to replicate data from source to target databases in near real-time. ([source](https://github.com/alibaba/otter#readme))
- [Distributed Data Synchronization Systems](https://awesome-repositories.com/f/data-databases/distributed-data-synchronization-systems.md) — Replicates data between databases across multiple geographic regions using incremental log parsing for near real-time consistency.
- [Regional Replication](https://awesome-repositories.com/f/data-databases/distributed-data-synchronization-systems/regional-replication.md) — Replicates database records across geographically distant data centers to improve local read performance.
- [Replicated Data Convergence](https://awesome-repositories.com/f/data-databases/data-replication/replicated-data-convergence.md) — Implements a consistency-based convergence algorithm to ensure final data agreement across disparate data centers.
- [Data Source Synchronizers](https://awesome-repositories.com/f/data-databases/data-synchronization/data-source-synchronizers.md) — Coordinates data flow by configuring data sources, parsing rules, and target tables. ([source](https://github.com/alibaba/otter/wiki/Adminguide))
- [Replication Loop Avoidance](https://awesome-repositories.com/f/data-databases/database-replication-tools/logical-replication-ingestion/replication-loop-avoidance.md) — Implements loop-avoidance logic to prevent infinite data loops during bi-directional synchronization.
- [Real-Time Data Streaming](https://awesome-repositories.com/f/data-databases/real-time-data-streaming.md) — Captures incremental database log changes and transforms them for near real-time updates in a data pipeline.

### DevOps & Infrastructure

- [Distributed Task Orchestrators](https://awesome-repositories.com/f/devops-infrastructure/distributed-task-orchestrators.md) — Provides a coordination layer to manage worker nodes and schedule large-scale data replication tasks across distributed environments.
- [Worker Node Management](https://awesome-repositories.com/f/devops-infrastructure/worker-node-management.md) — Manages a fleet of distributed worker nodes through a central controller that pushes configurations and monitors lag.

### Networking & Communication

- [Distributed Coordination Services](https://awesome-repositories.com/f/networking-communication/distributed-systems-p2p/distributed-systems-coordination/distributed-systems-infrastructure/distributed-coordination-services.md) — Provides a distributed coordination service to manage shared state and synchronize node configurations across geographic regions.

### Software Engineering & Architecture

- [Database Transaction Log Parsers](https://awesome-repositories.com/f/software-engineering-architecture/custom-log-formatting/log-parsing/database-transaction-log-parsers.md) — Parses incremental database logs (such as the MySQL binlog) to extract data changes for replication with little impact on source database performance.
- [Data Transformation Pipelines](https://awesome-repositories.com/f/software-engineering-architecture/data-transformation-pipelines.md) — Processes database records through a multi-stage pipeline of selection, extraction, transformation, and loading.
- [Distributed Cluster Coordination](https://awesome-repositories.com/f/software-engineering-architecture/distributed-cluster-coordination.md) — Coordinates nodes across multiple geographic regions using a shared source of truth to optimize read efficiency. ([source](https://github.com/alibaba/otter/wiki/Introduction))

### System Administration & Monitoring

- [Replication Health Monitors](https://awesome-repositories.com/f/system-administration-monitoring/monitoring-and-observability/observability-platforms/operational-health-alerting/health-monitoring-endpoints/application-health-monitors/replication-health-monitors.md) — Tracks replication lag, throughput, and progress to ensure data consistency across geographically distributed environments. ([source](https://github.com/alibaba/otter/wiki/Adminguide))
