30 Repos
Components that provide filtering, sorting, and grouping functionality for tabular data.
Explore 30 awesome GitHub repositories matching data & databases · Table Data Managers. Refine with filters or upvote what's useful.
AppFlowy is a local-first knowledge base and collaborative workspace platform designed for structured information management. It functions as a modular productivity suite where users organize content through a block-based document model, allowing for flexible nesting and granular manipulation of data. The system prioritizes data sovereignty by enabling self-hosted storage, ensuring that sensitive information remains under user control while maintaining offline accessibility. The platform distinguishes itself through a decoupled architecture that separates its high-performance, memory-safe cor
Enables advanced filtering, sorting, and grouping of tabular data to help users visualize and analyze complex datasets.
NocoDB is a visual platform that transforms relational databases into collaborative, spreadsheet-style workspaces. By acting as a headless database backend, it provides a unified environment for designing database structures, managing record relationships, and interacting with data without requiring manual SQL queries. The platform normalizes interactions across various SQL and NoSQL data sources, allowing users to manage complex datasets through a centralized interface. The project distinguishes itself by automatically generating RESTful and GraphQL APIs from existing database schemas, enabl
Empowers users to filter, sort, and group tabular data for clearer analysis and efficient information management.
This project provides a Model Context Protocol server that enables autonomous agents to interact with and manage automation workflows. It functions as an integration layer, allowing language models to discover, build, test, and deploy complex automation sequences through natural language instructions and structured schema-based communication. The platform distinguishes itself by offering granular control over automation logic, including the ability to perform surgical, incremental patches to specific workflow nodes rather than replacing entire structures. It supports multi-instance connectivi
Performs CRUD operations and bulk updates on data tables to maintain state.
Automa is a browser-based automation platform that enables users to build, schedule, and execute repetitive web tasks through a visual, no-code interface. By operating as a browser extension, it provides a canvas-based environment where users construct workflows by connecting functional blocks to interact with web elements, manage browser state, and process data. The platform distinguishes itself through its deep integration with the browser environment, allowing for complex orchestration such as event-driven triggers, cross-origin request handling, and the ability to package workflows as sta
Defines and manages persistent data tables with custom columns for workflow data.
DataX is a distributed data integration framework and plugin-based ETL tool designed for synchronizing large datasets between heterogeneous sources and destinations. It functions as a JDBC data migration engine and offline synchronization tool, enabling the movement of data between relational databases, NoSQL stores, and object storage. The system utilizes a plugin-based connector architecture that decouples reader and writer logic, allowing it to map and transform data types across different storage engines using a standardized internal representation. This design supports heterogeneous data
Moves versioned data from sources to NoSQL destinations by aligning reader and writer output formats.
Presto is a distributed SQL query engine designed for high-performance analytical processing across heterogeneous data sources. It functions as a data federation platform and massively parallel processing engine, allowing users to execute interactive queries against diverse storage systems without requiring data migration. By mapping remote metadata and structures to a unified relational namespace, it enables seamless cross-platform analysis through a standard SQL interface. The engine distinguishes itself through a pluggable connector architecture and a shared-nothing distributed processing
Captures specific snapshots of data using system versions or timestamps for historical analysis.
SQLAlchemy is a comprehensive Python SQL toolkit and object-relational mapper that provides a full suite of tools for interacting with relational databases. It serves as a foundational layer for database connectivity, offering both a high-level object-oriented interface for data persistence and a programmatic SQL expression language for constructing complex, dialect-agnostic queries. The project distinguishes itself through its sophisticated unit of work persistence, which coordinates atomic transactions and tracks object state changes to minimize redundant database operations. It provides a
Executes custom logic before or after database schema objects are created or dropped to automate lifecycle management.
Databend is a cloud-native data warehouse and OLAP database designed for large-scale analytics. It functions as a SQL-compliant engine and serverless analytics platform that separates compute from storage to allow for independent scaling. The system integrates vector database capabilities, indexing high-dimensional embeddings to enable semantic, hybrid, and full-text searches across massive datasets. It further distinguishes itself through serverless compute management that automatically scales resources based on demand and shuts them down during idle periods. The platform covers a broad set
Creates snapshots and branches of production data to enable experimentation and testing without affecting primary datasets.
LanceDB is a vector database and columnar data store designed to function as a versioned dataset manager and vector search engine. It serves as a high-performance backend for indexing and retrieving high-dimensional embeddings, providing the foundation for machine learning data pipelines. The system distinguishes itself through a combination of cloud-native object storage and immutable version tracking, allowing for data time-travel and reproducible AI experiments. It integrates hybrid search capabilities, merging dense vector similarity with BM25 full-text search and SQL-like scalar filters
Rolls back tables to specific prior versions without duplicating data to ensure reproducibility.
vxe-table is a high-performance data table component and UI library for Vue, designed for building data-heavy applications. It functions as a virtualized data grid and spreadsheet UI framework capable of rendering millions of rows by mounting only the visible elements of a dataset. The project distinguishes itself through spreadsheet-like functionality, including cell selection, copy-paste support, and the generation of cross-tabulated pivot tables. It also provides specialized tools for managing complex data hierarchies using virtual trees, row grouping, and cell merging. The library covers
Structures complex information through row grouping and cell merging to improve data readability.
Delta is a lakehouse table format that brings ACID transactions and data warehouse consistency to large scale data lakes on cloud object storage. It serves as an ACID transaction manager, coordinating atomic commits and serializable isolation for concurrent reads and writes across distributed compute engines. The project provides a multi-engine interoperability layer that uses format translation to allow diverse SQL engines and processing frameworks to read and write the same tables. It functions as a data versioning system, utilizing a transaction log to enable time travel, historical snapsh
Maintains historical snapshots of data to enable time-travel analysis and state reproduction.
Enso is a visual dataflow programming environment and multi-language data processing engine that compiles Enso, Python, Java, and JavaScript into a unified representation with a shared memory model for zero-overhead inter-language calls. It functions as a self-service data preparation and analysis platform where users can build data pipelines by connecting nodes in a graph, switching between a no-code visual interface and a code view while keeping all changes reviewable. The platform also serves as a cloud data workflow scheduler and API exposer, allowing workflows to run on a timetable or be
Automatically versions workflows, data files, and data links on every change for full state restoration.
Noms is a distributed version control database and content-addressable data store. It identifies data by cryptographic hashes to ensure integrity and deduplication, while tracking dataset state changes through a sequence of immutable commits to enable branching, forking, and historical recovery. The system functions as a peer-to-peer data synchronizer, reconciling state between disconnected database instances to ensure all nodes converge on the same data. It distinguishes itself as a schema-flexible document store that supports self-describing types, allowing schemas to evolve and widen as ne
Provides version history retrieval to inspect the chronological sequence of commits and differences for a dataset.
bup is a deduplicating backup manager and incremental backup system. It uses a Git packfile-based storage format to eliminate redundant data across files and versions, treating every incremental save as a full backup. The system provides secure remote transport interfaces for transferring and managing backup data on remote servers via SSH. It also includes a backup repository browser available as both a web interface and a filesystem mount for exploring and retrieving files from snapshots. The project covers broad capability areas including disaster recovery, repository administration, and s
Retrieves a specific previous version of backed-up data and restores it to a target directory.
pgroll is a PostgreSQL migration framework designed for zero-downtime schema changes. It applies non-blocking DDL operations that avoid exclusive locks on tables, and uses trigger-based column backfill to populate new columns while keeping them synchronized with old ones. The framework wraps each migration step in a database transaction that can be atomically committed or rolled back, and creates a versioned view layer that exposes both old and new schema versions simultaneously to client applications. The tool distinguishes itself by managing multiple schema versions via views, enabling safe
Creates a stable view layer that abstracts schema changes for querying old or new versions by name.
Accesses revision history for individual notes and reverts them to an earlier saved state.
🔥 基于大模型和 RAG 的智能问数系统,对话式数据分析神器。Text-to-SQL Generation via LLMs using RAG.
Configures and manages multiple types of data sources and tables to support flexible data access.
Pinot is a distributed, columnar analytical database designed for high-concurrency, low-latency query processing. It functions as a real-time OLAP datastore, enabling interactive, user-facing analytics by ingesting and querying massive datasets from both streaming and batch sources. The system architecture relies on a centralized controller for cluster coordination and a distributed segment-based storage model to ensure horizontal scalability. The platform distinguishes itself through a hybrid ingestion pipeline that unifies real-time event streams and historical batch data into a single quer
Reduces the number of data segments processed by pruning across all physical tables collectively to improve performance for ordered queries.
Apache Hudi is an open-source table format that brings ACID transactions, incremental processing, and multi-modal indexing to data lakes. It provides atomic commits with snapshot isolation, rollback, and optimistic concurrency control for reliable data lake operations, while supporting upserts, record-level updates, and deletions in large analytical datasets. The project distinguishes itself through a timeline-based architecture that coordinates all write operations, enabling features like time-travel querying, incremental change streaming, and multi-modal query views that include snapshot, i
Continuously schedules and orchestrates clustering, compaction, cleaning, file sizing, and indexing to maintain high performance.
CodeIgniter is a PHP web framework built on the Model-View-Controller pattern, designed for building full-stack web applications. It provides a lightweight toolkit with minimal configuration, organizing application logic into controllers, models, and views for clean separation of concerns. The framework includes a fluent query builder for constructing SQL statements programmatically, PSR-4 autoloading with namespace mapping, and a service-based dependency injection container for managing shared class instances. The framework distinguishes itself through its comprehensive set of built-in tools
Ships a command to rebuild individual database tables to reclaim unused space and improve performance.