# Reverse ETL Data Sync Tools

> Search results for `reverse-ETL tool that syncs warehouse data back to business apps` on awesome-repositories.com. 118 total matches; showing the first 50.

Explore on the web: https://awesome-repositories.com/q/reverse-etl-tool-that-syncs-warehouse-data-back-to-business-apps

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [this search on awesome-repositories.com](https://awesome-repositories.com/q/reverse-etl-tool-that-syncs-warehouse-data-back-to-business-apps).**

## Results

- [charles2gan/gda-android-reversing-tool](https://awesome-repositories.com/repository/charles2gan-gda-android-reversing-tool.md) (4,778 ⭐) — This project is a comprehensive Android reverse engineering suite that functions as a decompiler, bytecode deobfuscator, and malware analysis tool. It is designed to convert APK, DEX, and OAT binaries into human-readable source code using a native implementation that does not require a Java Virtual Machine.

The platform is distinguished by its integration with Frida for dynamic analysis, allowing users to hook methods, inject custom JavaScript, and dump device memory in real time. It also features specialized security engines, including a taint propagation engine and a stack-state machine, to
- [datahub-project/datahub](https://awesome-repositories.com/repository/datahub-project-datahub.md) (12,141 ⭐) — DataHub is a metadata management platform designed to unify technical, operational, and business context across diverse data ecosystems. By utilizing a graph-based metadata model and an event-driven ingestion architecture, it creates a centralized source of truth that maps complex data relationships, lineage, and ownership. This foundational framework enables organizations to maintain a synchronized view of their data landscape, supporting both human-led discovery and automated data operations.

The platform distinguishes itself through its focus on grounding artificial intelligence and autono
- [packtpublishing/llm-engineers-handbook](https://awesome-repositories.com/repository/packtpublishing-llm-engineers-handbook.md) (4,774 ⭐) — This project is an educational resource and engineering guide for building, deploying, and optimizing large language model applications and production pipelines. It serves as a blueprint for cloud AI infrastructure, providing a framework for orchestrating inference endpoints, data warehouses, and scalable production environments.

The repository provides specific implementation patterns for retrieval augmented generation to ground model responses in external data. It includes a training workflow for crawling, structuring, and processing datasets to facilitate model fine-tuning, alongside an ev
- [blockchain-etl/ethereum-etl](https://awesome-repositories.com/repository/blockchain-etl-ethereum-etl.md) (3,133 ⭐) — Python scripts for ETL (extract, transform and load) jobs for Ethereum blocks, transactions, ERC20 / ERC721 tokens, transfers, receipts, logs, contracts, internal transactions. Data is available in Google BigQuery https://goo.gl/oY5BCQ
- [airbytehq/airbyte](https://awesome-repositories.com/repository/airbytehq-airbyte.md) (21,472 ⭐) — Airbyte is a data integration platform designed to synchronize information between diverse applications, databases, and data warehouses. It functions as an extract, transform, and load orchestrator that manages automated data movement workflows across cloud, on-premise, and hybrid environments. The platform provides a standardized interface for connectors, enabling the movement of structured and unstructured data while maintaining stateful checkpoints for reliable incremental syncing.

The platform distinguishes itself through a containerized architecture that isolates connectors to prevent de
- [helicone/helicone](https://awesome-repositories.com/repository/helicone-helicone.md) (5,830 ⭐) — Helicone is an AI gateway and observability platform designed to intercept, manage, and monitor interactions with large language models. By acting as a reverse-proxy, it provides a centralized layer for routing requests across multiple AI providers, allowing developers to maintain consistent application logic while gaining deep visibility into model performance, usage, and costs.

The platform distinguishes itself through a robust suite of traffic management and prompt engineering tools. It enables policy-driven control, including automatic failover between providers, rate limiting, and edge-b
- [turboway/bigdata_analyse](https://awesome-repositories.com/repository/turboway-bigdata-analyse.md) (5,238 ⭐) — This project is a collection of big data frameworks and pipelines, including an Apache Hive analysis framework, a behavioral data analytics platform, a predictive analytics engine, and real-time data pipelines. It provides the infrastructure for building Extract, Transform, Load (ETL) workflows to process large datasets for distributed storage and SQL-based analysis.

The system supports diverse analytical implementations, such as a predictive engine using linear regression for value forecasting and a real-time architecture that moves data through message brokers for immediate reporting. It in
- [3lvis/sync](https://awesome-repositories.com/repository/3lvis-sync.md) (2,543 ⭐) — JSON to SwiftData and back. SwiftData Sync.
- [clickhouse/clickhouse](https://awesome-repositories.com/repository/clickhouse-clickhouse.md) (48,229 ⭐) — ClickHouse is a high-performance, columnar analytical database designed for real-time query execution and large-scale data aggregation. It functions as a distributed data warehouse capable of processing petabytes of information, while also providing an embedded engine that integrates directly into applications for native query capabilities without external dependencies. The system is built to handle high-throughput ingestion and complex analytical workloads, delivering millisecond-level latency for interactive dashboards and operational monitoring.

The platform distinguishes itself through ad
- [souravroy-etl/duckle](https://awesome-repositories.com/repository/souravroy-etl-duckle.md) (504 ⭐) — Local-first ETL/ELT studio: a drag-and-drop visual pipeline designer that compiles to SQL and runs on DuckDB. Tiny desktop app, no servers, git-friendly workspaces.
- [debezium/debezium](https://awesome-repositories.com/repository/debezium-debezium.md) (12,421 ⭐) — Debezium is a distributed change data capture platform that streams row-level database modifications as real-time events. By parsing database transaction logs, the system broadcasts structural and data changes to message brokers, enabling reactive processing and data integration across distributed architectures.

The platform utilizes log-based capture to extract modifications directly from transaction logs, ensuring minimal impact on source system performance while maintaining the original commit order of operations. It employs database-specific connector adapters to translate proprietary bin
- [dlt-hub/dlt](https://awesome-repositories.com/repository/dlt-hub-dlt.md) (5,472 ⭐) — dlt is a Python data ingestion tool and ETL pipeline framework designed to fetch data from diverse sources and persist it into structured destinations. It functions as a schema inference engine that automatically detects data types and flattens nested JSON structures into relational tables, moving data from sources to lakehouses, warehouses, or vector databases.

The project distinguishes itself through AI-powered pipeline generation, using large language models to scaffold extraction code and connectors for REST APIs. It also supports multimodal vector storage and specialized population of ve
- [supabase/realtime](https://awesome-repositories.com/repository/supabase-realtime.md) (7,488 ⭐) — Realtime is a real-time data distribution and synchronization engine that enables applications to stream database changes and coordinate state between clients. It functions as a synchronization layer that monitors database write-ahead logs to provide change data capture and pushes updates to authorized clients via WebSockets.

The project features a real-time presence server for tracking the online status of active users and a broadcast service for sending ephemeral messages without database persistence. It organizes communication through channel-based message routing and uses a structured JSO
- [fulldecent/google-sheets-etl](https://awesome-repositories.com/repository/fulldecent-google-sheets-etl.md) (22 ⭐) — Live import all your Google Sheets to your data warehouse
- [jitsucom/jitsu](https://awesome-repositories.com/repository/jitsucom-jitsu.md) (4,782 ⭐) — Jitsu is a customer data platform designed for collecting, transforming, and routing application events to data warehouses and marketing tools. It functions as an event ingestion engine and data warehouse router, capturing behavioral data via APIs and SDKs for real-time processing and storage.

The platform features a programmable JavaScript data pipeline that allows for the filtering, enrichment, and reshaping of event data during transit. It includes a customer identity stitcher that merges anonymous and known user identifiers to maintain persistent customer profiles within a warehouse.

- [estuary/estuary-warehouse-benchmark](https://awesome-repositories.com/repository/estuary-estuary-warehouse-benchmark.md) (2 ⭐) — 👉 Check out the full report here: https://estuary.dev/data-warehouse-benchmark-report/
- [treeverse/lakefs](https://awesome-repositories.com/repository/treeverse-lakefs.md) (5,406 ⭐) — lakeFS is a data lake versioning system that provides Git-like branching and commits for large datasets stored in object storage. It functions as a version control layer, enabling the creation of immutable snapshots, atomic commits, and zero-copy branching to create isolated environments for data experimentation without duplicating physical files.

The system serves as an S3-compatible storage gateway and an Iceberg REST catalog, allowing standard cloud storage protocols and compatible clients to manage versioned tables. It acts as a data quality gatekeeper by using an event-driven hook system
- [plotly/plotly.py](https://awesome-repositories.com/repository/plotly-plotly-py.md) (18,270 ⭐) — Plotly.py is a comprehensive framework for building production-ready data applications and interactive dashboards directly from Python code. It functions as both a high-performance visualization library for browser-based charts and a full-stack tool for transforming analytical scripts into responsive, web-based interfaces. By abstracting away the need for manual HTML or JavaScript, it allows developers to define complex layouts and functional logic using modular, reusable components.

The framework distinguishes itself through a robust architecture that handles event orchestration and state sy
- [pawl/awesome-etl](https://awesome-repositories.com/repository/pawl-awesome-etl.md) (3,565 ⭐) — A curated list of notable ETL (extract, transform, load) frameworks, libraries and software.
- [dagster-io/dagster](https://awesome-repositories.com/repository/dagster-io-dagster.md) (14,974 ⭐) — Dagster is a data orchestration platform designed to manage the entire lifecycle of data assets through declarative modeling and version-controlled code. It functions as a workflow engine that treats data assets as first-class primitives, allowing teams to define, schedule, and monitor complex pipelines while maintaining clear visibility into lineage, dependencies, and data quality.

The platform distinguishes itself by using a code-as-configuration framework that enables standard software engineering practices, such as unit testing and local mocking, to be applied directly to data workflows.
- [plausible/analytics](https://awesome-repositories.com/repository/plausible-analytics.md) (24,245 ⭐) — This project is an open-source, privacy-focused web analytics platform designed for high-throughput data ingestion and multi-tenant data management. It provides a cookie-less tracking engine that captures visitor interactions using ephemeral request metadata, ensuring comprehensive traffic visibility while maintaining strict privacy standards. The architecture utilizes an event-driven ingestion pipeline and aggregated metric storage to decouple data collection from processing, enabling efficient long-term retrieval and responsive dashboard performance.

What distinguishes this platform is its
- [vindarel/languages-that-compile-to-python](https://awesome-repositories.com/repository/vindarel-languages-that-compile-to-python.md) (280 ⭐) — List of languages that compile to Python
- [edtechre/pybroker](https://awesome-repositories.com/repository/edtechre-pybroker.md) (3,191 ⭐) — pybroker is a Python algorithmic trading framework and quantitative technical analysis library designed for developing, testing, and optimizing trading strategies using historical market data. It functions as a trading strategy backtester and a financial performance evaluator, providing a structured environment to simulate trading rules and analyze their statistical reliability.

The framework distinguishes itself through a market data integration layer that handles the fetching and caching of historical price data from external providers. It incorporates an event-driven backtesting engine and
- [abraunegg/onedrive](https://awesome-repositories.com/repository/abraunegg-onedrive.md) (12,577 ⭐) — This project is a command-line synchronization client for OneDrive and SharePoint libraries on Linux. It functions as a synchronization engine that aligns local filesystems with cloud storage through bidirectional, unidirectional, or download-only workflows.

The client supports headless authentication for servers without web browsers and can be deployed as a background service or within a containerized environment. It enables the management of multiple distinct cloud accounts on a single system and integrates with shared SharePoint sites and document libraries.

The synchronization engine inc
- [mementum/backtrader](https://awesome-repositories.com/repository/mementum-backtrader.md) (20,462 ⭐) — Backtrader is a Python framework designed for the development, backtesting, and live execution of algorithmic trading strategies. It provides a comprehensive environment for quantitative finance, allowing users to simulate trading logic against historical market data or connect directly to brokerage platforms for automated real-time trading.

The project distinguishes itself through a unified event-driven architecture that treats backtesting and live trading with the same API. This consistency is supported by a flexible data-feed abstraction layer that normalizes diverse financial sources, ena
- [universaldatatool/universal-data-tool](https://awesome-repositories.com/repository/universaldatatool-universal-data-tool.md) (2,068 ⭐) — Collaborate & label any type of data, images, text, or documents, in an easy web interface or desktop app.
- [maplibre/maplibre-gl-js](https://awesome-repositories.com/repository/maplibre-maplibre-gl-js.md) (9,931 ⭐) — Maplibre GL JS is a WebGL map rendering engine and vector tile map library used to create interactive web maps. It serves as a web-based GIS visualization tool and an interactive map interface framework for rendering geographic data and vector tiles on web pages.

The library provides capabilities for 3D terrain rendering and the integration of custom 3D content. It supports complex geospatial data visualization through the use of heatmaps, clusters, and 3D extrusions, while allowing for custom map styling and environmental effect customization.

The system covers a broad range of functional a
- [windofshadow/that](https://awesome-repositories.com/repository/windofshadow-that.md) (121 ⭐) — This repository contains the PyTorch implementation of the THAT methods in the following paper:
- [garden-co/jazz](https://awesome-repositories.com/repository/garden-co-jazz.md) (2,537 ⭐) — Jazz is a local-first relational database and synchronization framework designed for offline-capable applications. It functions as a reactive state management system that treats database records as the primary source of truth, automatically updating user interfaces in real time as underlying data changes.

The project distinguishes itself through a collaborative data synchronization model that utilizes row-level versioning to track branching edit histories. It implements a security engine based on identity-claim row security, which enforces granular permissions on individual records, and suppo
- [sfu-db/connector-x](https://awesome-repositories.com/repository/sfu-db-connector-x.md) (2,561 ⭐) — Connector-X is a high-performance SQL data extraction library and bridge for transferring relational database records into memory-efficient data structures. It functions as a parallel database connector and federated query engine capable of executing and joining queries across multiple remote database connections to aggregate data locally.

The project distinguishes itself through a zero-copy approach to data loading, which transfers SQL query results into memory structures without duplicating data. It maximizes throughput by partitioning SQL queries into threads, employing parallel columnar a
- [kylekatarnls/business-time](https://awesome-repositories.com/repository/kylekatarnls-business-time.md) (318 ⭐) — Carbon mixin to handle business days and opening hours
- [dbt-labs/dbt-core](https://awesome-repositories.com/repository/dbt-labs-dbt-core.md) (13,051 ⭐) — dbt-core is a command-line framework for transforming data within a warehouse using modular SQL and version control. It functions as a data transformation engine that enables users to define data structures and business logic through declarative configuration files, which the system then compiles into executable code. By managing complex data dependencies through a directed acyclic graph, it ensures that transformation tasks execute in the correct order while maintaining a manifest-driven state to track lineage and execution history.

The project distinguishes itself through an adapter-based d
- [github/docs](https://awesome-repositories.com/repository/github-docs.md) (18,951 ⭐) — GitHub Copilot is an AI-powered development platform designed to integrate large language models directly into coding environments. It functions as an interactive assistant and an agentic workflow orchestrator, enabling developers to automate code generation, perform automated code reviews, and execute complex, multi-step development tasks through natural language prompts.

The platform distinguishes itself through its autonomous agent capabilities, which allow for repository-level research, implementation planning, and code modifications across multiple files. It supports a modular architectu
- [bitwarden/server](https://awesome-repositories.com/repository/bitwarden-server.md) (18,074 ⭐) — This project provides a comprehensive, self-hosted platform for zero-knowledge credential management and enterprise secrets orchestration. It functions as a secure vault that ensures all encryption and decryption processes occur exclusively on the client side, preventing the server from ever accessing plaintext data. By combining identity federation with robust access controls, the system enables organizations to centralize the management of passwords, passkeys, and sensitive infrastructure credentials.

The platform distinguishes itself through its focus on both human-centric security and aut
- [red-data-tools/youplot](https://awesome-repositories.com/repository/red-data-tools-youplot.md) (4,761 ⭐) — YouPlot is a command line plotting utility and terminal data visualization tool used to render statistical plots and charts directly within a terminal interface using Unicode characters. It functions as a Unix pipeline plotter, allowing users to visualize numerical data without leaving the shell.

The project operates as a real-time data visualizer, drawing plots progressively as data streams into the system. It integrates into command line pipelines by reading data from standard input to provide real-time stream monitoring and data analysis.

The tool covers a variety of rendering capabilitie
- [caddyserver/caddy](https://awesome-repositories.com/repository/caddyserver-caddy.md) (73,492 ⭐) — Caddy is an extensible, modular web server platform designed for high-performance traffic management and automated security. At its core, it functions as a dynamic HTTP gateway that handles request routing, static asset delivery, and reverse proxying through a chain of configurable handler modules. The system is built on a modular architecture that allows developers to extend server functionality by registering custom components, all managed through a unified lifecycle and provisioning framework.

What distinguishes Caddy is its focus on automated infrastructure and zero-downtime operations. I
- [randomfractals/pro-data-tools](https://awesome-repositories.com/repository/randomfractals-pro-data-tools.md) (41 ⭐) — Random Fractals Inc. Data Tools 🛠️ is a collection of public data visualization extensions, data viewers, VS Code Notebook renderers, and code snippets for devs and data scientists using VS Code IDE, published under our Random Fractals Inc. ☂️ org.
- [openbb-finance/openbb](https://awesome-repositories.com/repository/openbb-finance-openbb.md) (69,583 ⭐) — OpenBB is a financial data platform and investment research terminal designed to aggregate, normalize, and distribute market data across analytical workflows. It functions as a comprehensive ecosystem that bridges disparate financial data providers with custom applications, spreadsheets, and internal modeling infrastructure.

The platform distinguishes itself through a provider-based data abstraction layer that normalizes heterogeneous financial APIs into a consistent, schema-driven format. This architecture supports quantitative research automation and the construction of interactive, widget-
- [kamranahmedse/developer-roadmap](https://awesome-repositories.com/repository/kamranahmedse-developer-roadmap.md) (357,434 ⭐) — Developer Roadmap is a community-driven platform that provides structured, graph-based learning paths for software engineering. It serves as a comprehensive knowledge repository where technical domains are organized into visual sequences to guide professional skill acquisition and career growth.

The project distinguishes itself through a collaborative ecosystem that enables users to contribute roadmaps, curate industry best practices, and maintain professional profiles. It integrates diagnostic assessment frameworks to evaluate technical proficiency, helping developers identify knowledge gaps
- [keen/dashboards](https://awesome-repositories.com/repository/keen-dashboards.md) (11,038 ⭐) — This project is a collection of responsive CSS Grid dashboard templates and a data visualization UI kit. It provides a set of HTML layouts designed for building analytics interfaces and monitoring views for KPIs and business metrics that adapt to different screen sizes.

The toolkit is library-agnostic, allowing the connection of static HTML templates to any external data source or third-party charting library without requiring custom adapter code. It uses a template-driven approach to separate the visual structure of the dashboard from the underlying data.

The capabilities cover the assembly
- [business-science/ai-data-science-team](https://awesome-repositories.com/repository/business-science-ai-data-science-team.md) (4,805 ⭐) — This project is a platform that orchestrates multiple AI agents to automate data science workflows—covering data loading, cleaning, feature engineering, modeling, and querying. It also functions as a natural language database query interface, converting plain English questions into SQL, and as a visual data pipeline builder.

Custom agents are generated on demand by filling prompt templates for tasks like data cleaning and feature engineering. Pipelines incorporate human-in-the-loop checkpoints that pause execution for review and approval. Intermediate results are saved as versioned files, ena
- [marmelab/react-admin](https://awesome-repositories.com/repository/marmelab-react-admin.md) (26,780 ⭐) — React-admin is a framework for building data-driven administrative interfaces that connect to REST or GraphQL backends. It provides a comprehensive suite of tools for managing the full lifecycle of administrative applications, including resource-oriented routing, declarative form scaffolding, and context-driven state management. By utilizing a modular adapter-based architecture, the framework abstracts backend communication, allowing developers to build consistent CRUD interfaces that handle data fetching, authentication, and synchronization automatically.

The project distinguishes itself thr
- [electric-sql/electric](https://awesome-repositories.com/repository/electric-sql-electric.md) (9,909 ⭐) — Electric is a Postgres data synchronization engine and replication proxy designed to enable local-first software. It replicates data from Postgres databases to client-side stores in real time using logical replication, allowing applications to maintain a local embedded database for offline access and low-latency updates.

The system distinguishes itself by using shapes to filter and authorize specific subsets of database rows and columns before streaming them to clients or edge workers. It further supports multi-user collaboration by integrating a conflict-free replicated data type framework t
- [pret/pokemon-reverse-engineering-tools](https://awesome-repositories.com/repository/pret-pokemon-reverse-engineering-tools.md) (351 ⭐) — Tools for building and disassembling Pokémon Red and Pokémon Crystal
- [panzerschrek/chasm-reverse](https://awesome-repositories.com/repository/panzerschrek-chasm-reverse.md) (176 ⭐) — Tools for reverse-engineering the game "Chasm: The Rift"
- [ankitects/anki](https://awesome-repositories.com/repository/ankitects-anki.md) (28,571 ⭐) — Anki is a cross-platform flashcard management system designed to optimize long-term memory retention through spaced-repetition learning. It functions as a digital learning assistant that uses active recall practice and automated scheduling algorithms to determine the ideal timing for card reviews based on individual performance history. The core system relies on a local relational database to ensure data persistence and portability, while supporting complex study workflows through flexible note-type schema modeling and template-driven content rendering.

The platform distinguishes itself throu
- [ory/keto](https://awesome-repositories.com/repository/ory-keto.md) (5,270 ⭐) — Ory Keto is an open-source authorization server that implements Google Zanzibar’s relationship-based access control model. It stores every access relationship as a tuple in a SQL database and exposes a declarative TypeScript-like namespace language for defining object types, relations, and permissions. The service provides bidirectional permission resolution, configurable consistency levels for checks, and dual gRPC and REST APIs for broad integration.

Keto extends the Zanzibar model with edge enforcement of access policies, structured compliance auditing of permission decisions, and infrastr
- [build-trust/ockam](https://awesome-repositories.com/repository/build-trust-ockam.md) (4,628 ⭐) — Ockam is a zero-trust networking framework designed to secure data transit between distributed applications using an identity-based network overlay. It provides the primitives necessary to establish mutually authenticated and end-to-end encrypted connections, removing the reliance on traditional network-layer security.

The project is distinguished by its use of attribute-based access control and verifiable credentials to manage trust at scale. It implements cryptographic identity rotation to maintain identity continuity and integrates with hardware-backed key management systems to secure priv
- [dfsramos/wezterm-sync](https://awesome-repositories.com/repository/dfsramos-wezterm-sync.md) (2 ⭐) — A WezTerm plugin that syncs your config to a private GitHub Gist — keeping it in sync across multiple machines with no dotfiles setup required.
- [chartdb/chartdb](https://awesome-repositories.com/repository/chartdb-chartdb.md) (21,286 ⭐) — ChartDB is a database schema visualizer and entity-relationship diagramming platform designed to help developers understand, design, and document complex data architectures. It functions as a visual workspace where users can create and modify database schemas, define table attributes, and map foreign key relationships. By parsing database metadata or SQL scripts, the tool generates interactive diagrams that provide a clear overview of structural interdependencies and data associations.

The platform distinguishes itself through its focus on automated documentation and schema synchronization. I