Optimized pipelines for loading and transforming large datasets.
Distinguishing note: Focuses on performance-oriented ingestion rather than general data access.
1 GitHub repository listed under Data & Databases · High-Performance Ingestion.
DuckDB is an in-process analytical database engine designed to run directly within an application process. As a zero-dependency, embedded system, it provides enterprise-grade SQL data processing capabilities without the overhead of managing a dedicated database server. It is built to handle complex analytical and aggregation tasks by storing and retrieving information in columns, allowing for high-performance relational data manipulation. The engine distinguishes itself through a columnar vectorized execution model that maximizes CPU cache efficiency during query operations. It employs adapti
Loads and transforms massive volumes of data using efficient bulk operations and schema inference.
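To make "bulk operations and schema inference" concrete, here is a minimal sketch of the general idea. It is not DuckDB's implementation or API: it uses Python's standard-library `csv` and `sqlite3` modules as stand-ins, and the names `infer_type`, `bulk_load`, `SAMPLE_CSV`, and the `sales` table are hypothetical. Column types are guessed from a small sample of rows, then all rows are inserted in one batched statement inside a single transaction.

```python
import csv
import io
import itertools
import sqlite3


def infer_type(values):
    """Pick the narrowest SQL type that fits every sampled value."""
    def all_parse(cast):
        try:
            for v in values:
                cast(v)
            return True
        except ValueError:
            return False

    if all_parse(int):
        return "INTEGER"
    if all_parse(float):
        return "REAL"
    return "TEXT"


def bulk_load(conn, table, csv_text, sample_size=100):
    """Infer a schema from the first rows, then insert every row in one batch."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)

    # Schema inference: look only at a bounded sample, not the whole input.
    sample = list(itertools.islice(reader, sample_size))
    types = [infer_type([row[i] for row in sample]) for i in range(len(header))]

    columns = ", ".join(f'"{name}" {typ}' for name, typ in zip(header, types))
    conn.execute(f'CREATE TABLE "{table}" ({columns})')

    # Bulk operation: one executemany call in one transaction,
    # streaming the sampled rows followed by the rest of the reader.
    placeholders = ", ".join("?" for _ in header)
    with conn:
        conn.executemany(
            f'INSERT INTO "{table}" VALUES ({placeholders})',
            itertools.chain(sample, reader),
        )
    return dict(zip(header, types))


# Hypothetical sample data, used for illustration only.
SAMPLE_CSV = "id,price,city\n1,9.5,Oslo\n2,12,Lima\n3,7.25,Pune\n"

conn = sqlite3.connect(":memory:")
schema = bulk_load(conn, "sales", SAMPLE_CSV)
row_count = conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
total = conn.execute("SELECT SUM(price) FROM sales").fetchone()[0]
print(schema, row_count, total)
```

Sampling keeps inference cheap on large inputs, at the cost of possibly mis-typing a column whose later rows do not match the sample. A production engine would need a fallback for that case.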