DuckDB is an embedded, in-process SQL database management system built for analytical (OLAP) workloads. It can act as a query engine over Parquet and CSV files, letting users run complex SQL queries on large datasets without a separate server process.
The system is designed for local analytical processing and embedded data science workflows. Parquet and CSV files can be queried directly from disk, without first importing them into a persistent database.
The engine provides high-performance analytical SQL execution, including support for window functions and nested subqueries. It combines a columnar storage layout with vectorized query execution, which processes values in batches rather than one row at a time, to handle large-scale data manipulation and exploration.
The database is accessible through a standalone command-line interface and client libraries for Python, R, and Java, as well as a WebAssembly (Wasm) build.