1 repo
Techniques for reducing memory footprint by mapping repeated values to numeric identifiers.
Distinguishing note: Focuses on memory-efficient storage representations rather than general data compression.
Explore 1 awesome GitHub repository matching data & databases · Data Encoding Optimizations. Refine with filters or upvote what's useful.
Polars is a high-performance columnar data processing library designed for efficient analytical workflows. It functions as a structured data library that organizes information into typed columns, utilizing the Apache Arrow memory format to enable zero-copy data sharing and cache-friendly, vectorized operations. The engine is built to handle large-scale tabular datasets, providing both local and distributed analytical runtimes that scale from single-machine environments to multi-node clusters. The project distinguishes itself through a sophisticated lazy query engine that constructs abstract e
Optimizes memory usage by representing repeated string data as numeric placeholders.