2 repositories
Processes to populate missing values in existing dataset columns using defined functions.
Explore 2 GitHub repositories matching Data & Databases · Data Backfilling.
LanceDB is a vector database and columnar data store designed to function as a versioned dataset manager and vector search engine. It serves as a high-performance backend for indexing and retrieving high-dimensional embeddings, providing the foundation for machine learning data pipelines. The system distinguishes itself through a combination of cloud-native object storage and immutable version tracking, allowing for data time-travel and reproducible AI experiments. It integrates hybrid search capabilities, merging dense vector similarity with BM25 full-text search and SQL-like scalar filters
Updates rows lacking values in target columns by executing defined functions to fill the gaps.
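The idea of filling gaps by running a function over rows that lack a value can be sketched in plain Python. This is a generic illustration, not LanceDB's API: the `backfill` helper, the `Row` alias, and the sample `text`/`length` columns are all hypothetical names chosen for the example.

```python
from typing import Any, Callable

Row = dict[str, Any]  # hypothetical in-memory row representation


def backfill(rows: list[Row], column: str, fn: Callable[[Row], Any]) -> int:
    """Fill `column` in place for rows where it is missing or None.

    Returns the number of rows that were filled.
    """
    filled = 0
    for row in rows:
        # Only touch rows that lack a value; existing values are preserved.
        if row.get(column) is None:
            row[column] = fn(row)
            filled += 1
    return filled


rows = [
    {"text": "hello", "length": 5},       # already populated, left alone
    {"text": "backfill", "length": None}, # explicit null
    {"text": "lance"},                    # column absent entirely
]
count = backfill(rows, "length", lambda r: len(r["text"]))
```

Skipping rows that already hold a value makes the operation safe to re-run, which is useful when a backfill is interrupted partway through.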
pgroll is a PostgreSQL migration framework designed for zero-downtime schema changes. It applies non-blocking DDL operations that avoid exclusive locks on tables, and uses trigger-based column backfill to populate new columns while keeping them synchronized with old ones. The framework wraps each migration step in a database transaction that can be atomically committed or rolled back, and creates a versioned view layer that exposes both old and new schema versions simultaneously to client applications. The tool distinguishes itself by managing multiple schema versions via views, enabling safe
Fills new columns from existing data and keeps them synchronized via triggers during migrations.
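The general pattern of backfilling a new column while triggers keep it in sync can be sketched with Python's built-in `sqlite3` module. This is an analogy under stated assumptions, not pgroll's implementation: pgroll targets PostgreSQL, and the `users` table, the `name_upper` column, and the trigger names below are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, name_upper TEXT);
INSERT INTO users (name) VALUES ('ada'), ('grace');

-- Keep the new column in sync for writes that arrive during the migration.
CREATE TRIGGER users_sync_insert AFTER INSERT ON users
BEGIN
    UPDATE users SET name_upper = upper(NEW.name) WHERE id = NEW.id;
END;
CREATE TRIGGER users_sync_update AFTER UPDATE OF name ON users
BEGIN
    UPDATE users SET name_upper = upper(NEW.name) WHERE id = NEW.id;
END;

-- Backfill rows that existed before the triggers were created.
UPDATE users SET name_upper = upper(name) WHERE name_upper IS NULL;
""")

# Writes made after the triggers exist are synchronized automatically.
conn.execute("INSERT INTO users (name) VALUES ('linus')")
conn.execute("UPDATE users SET name = 'alan' WHERE id = 1")
synced = conn.execute(
    "SELECT name, name_upper FROM users ORDER BY id"
).fetchall()
```

Creating the triggers before running the backfill means no write can slip in between the two steps and leave the new column empty.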