Connect is a Kafka data integration platform and stream processing engine for building declarative pipelines that move and transform messages between Kafka topics and external systems. It also serves as a Kafka Connect framework and a change data capture tool, streaming database changes in real time to keep data synchronized across distributed environments.
The project differentiates itself through a dedicated mapping language for mutating and reshaping message payloads and the ability to execute custom processing logic within a sandboxed WebAssembly runtime. It also provides an observability pipeline that exports metrics and execution traces using the OpenTelemetry standard.
The system covers a broad range of integrations, including cloud data warehousing targets such as BigQuery and Iceberg, SQL data management, and cloud storage. It also supports Grok text processing, schema registry integration, and broker message routing, which distributes data to multiple outputs.
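Broker routing to multiple outputs can be pictured as a fan-out: each message is delivered to every configured destination. The sketch below is only a conceptual illustration in Python, not the project's implementation or API. The names `fan_out`, `warehouse`, and `archive` are hypothetical, and in-memory lists stand in for real outputs.

```python
from typing import Callable, Iterable

Message = dict
Output = Callable[[Message], None]


def fan_out(messages: Iterable[Message], outputs: list[Output]) -> int:
    """Deliver a copy of every message to every output.

    Returns the total number of deliveries made.
    """
    delivered = 0
    for message in messages:
        for write in outputs:
            # Shallow copy so one output cannot mutate what another receives.
            write(dict(message))
            delivered += 1
    return delivered


# Hypothetical destinations: plain lists standing in for real outputs.
warehouse: list[Message] = []
archive: list[Message] = []

count = fan_out([{"id": 1}, {"id": 2}], [warehouse.append, archive.append])
```

A real broker would typically also handle per-output failures, retries, and ordering; the sketch omits those concerns to keep the routing pattern visible.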
Configuration is managed through structured files, with utilities for validating configuration against a schema and for generating pipelines from natural language descriptions.
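To illustrate what validating a configuration before running it can involve, here is a minimal Python sketch that checks a parsed configuration against a tiny hand-written schema. The section names (`input`, `pipeline`, `output`) and the `validate_config` helper are assumptions made for this example, not the project's actual schema or tooling.

```python
# Hypothetical top-level sections, assumed for illustration only.
REQUIRED_SECTIONS = ("input", "output")
OPTIONAL_SECTIONS = ("pipeline",)


def validate_config(config: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    allowed = set(REQUIRED_SECTIONS) | set(OPTIONAL_SECTIONS)

    # Every required section must be present.
    for section in REQUIRED_SECTIONS:
        if section not in config:
            errors.append(f"missing required section: {section}")

    # Unknown keys are reported rather than ignored, which catches typos.
    for key in config:
        if key not in allowed:
            errors.append(f"unknown section: {key}")

    # Each recognized section must itself be a mapping.
    for key, value in config.items():
        if key in allowed and not isinstance(value, dict):
            errors.append(f"section {key} must be a mapping")

    return errors
```

For example, `validate_config({"input": {}, "outptu": {}})` reports both the missing `output` section and the misspelled key, which is the kind of early feedback a validation step is meant to provide.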