Provides mechanisms to install and manage software packages for remote data processing environments.
Distinguishing note: Specifically handles package management for distributed compute nodes, not local library installation.
1 GitHub repository in Data & Databases · Distributed Extensions.
Polars is a high-performance columnar data processing library designed for efficient analytical workflows. It functions as a structured data library that organizes information into typed columns, utilizing the Apache Arrow memory format to enable zero-copy data sharing and cache-friendly, vectorized operations. The engine is built to handle large-scale tabular datasets, providing both local and distributed analytical runtimes that scale from single-machine environments to multi-node clusters. The project distinguishes itself through a sophisticated lazy query engine that constructs abstract e
Extends local data processing capabilities with remote cloud execution features using standard package management.