5 repositories
Tracking dataset state changes through a sequence of immutable snapshots to allow branching and auditing.
Distinct from Commit Index Tracking: Existing candidates focus on vulnerability matching, semantic versioning, or issue tracking, not the data-level commit history of a database.
Explore 5 awesome GitHub repositories matching data & databases · Immutable Commit Versioning.
Noms is a distributed version control database and content-addressable data store. It identifies data by cryptographic hashes to ensure integrity and deduplication, while tracking dataset state changes through a sequence of immutable commits to enable branching, forking, and historical recovery. The system functions as a peer-to-peer data synchronizer, reconciling state between disconnected database instances to ensure all nodes converge on the same data. It distinguishes itself as a schema-flexible document store that supports self-describing types, allowing schemas to evolve and widen as ne
Tracks state changes using a progression of immutable commit structures to enable branching and merging.
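The combination described above (data identified by cryptographic hash, history as a chain of immutable commits, branches as movable pointers) can be illustrated with a minimal sketch. This is not Noms' actual data model or API; the `Store` class, its method names, and the choice of JSON plus SHA-256 are assumptions made only for illustration.

```python
import hashlib
import json


def content_hash(obj) -> str:
    """Hash the canonical JSON form of obj, so equal content gets an equal id."""
    data = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


class Store:
    """Hypothetical content-addressed store with commit chains and branches."""

    def __init__(self):
        self.objects = {}   # hash -> stored object (never overwritten)
        self.branches = {}  # branch name -> hash of its latest commit

    def put(self, obj) -> str:
        h = content_hash(obj)
        self.objects.setdefault(h, obj)  # identical content is stored once
        return h

    def commit(self, branch: str, value) -> str:
        """Record a new snapshot whose parent is the branch's current head."""
        value_hash = self.put(value)
        parent = self.branches.get(branch)  # None for the first commit
        commit_hash = self.put({"parent": parent, "value": value_hash})
        self.branches[branch] = commit_hash  # only the pointer moves
        return commit_hash

    def branch(self, new: str, source: str) -> None:
        """Create a branch by copying a pointer; no data is duplicated."""
        self.branches[new] = self.branches[source]

    def history(self, branch: str) -> list:
        """Walk parent links from the branch head back to the root commit."""
        out = []
        h = self.branches.get(branch)
        while h is not None:
            out.append(h)
            h = self.objects[h]["parent"]
        return out
```

Because a commit's hash covers its parent's hash, changing any earlier snapshot would change every later commit id, which is what makes the history auditable.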
Pachyderm is a containerized, versioned, and lineage-tracked data pipeline platform that runs natively on Kubernetes. It combines a distributed file system backend with immutable data versioning, so every commit to a data repository creates an auditable snapshot, and every pipeline step executes as an isolated container. The platform is defined by a data-centric pipeline model where pipelines are specified by their input and output data repositories rather than explicit task sequences, and provenance is recorded as a directed acyclic graph of commits linking output data to its input sources an
Every commit to a data repository creates an immutable snapshot, enabling full reproducibility and lineage tracking.
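Provenance recorded as a directed acyclic graph of commits can be queried by walking from an output commit back to everything it was derived from. The sketch below is an illustration only and does not use Pachyderm's API; the `provenance` mapping, the `upstream` function, and the `repo@n` commit ids are hypothetical.

```python
from typing import Dict, List, Set


def upstream(provenance: Dict[str, List[str]], commit_id: str) -> Set[str]:
    """Return every commit that commit_id depends on, directly or indirectly.

    provenance maps a commit id to the ids of the input commits it was built from.
    """
    seen: Set[str] = set()
    stack = list(provenance.get(commit_id, []))
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # shared ancestors are visited once
        seen.add(current)
        stack.extend(provenance.get(current, []))
    return seen


# Hypothetical lineage: a model built from cleaned data and labels.
provenance = {
    "raw@1": [],
    "labels@1": [],
    "clean@1": ["raw@1"],
    "model@1": ["clean@1", "labels@1"],
}
inputs_of_model = upstream(provenance, "model@1")
```

Since commits are immutable, the set returned for a given output commit never changes, so the same query can be used later to reproduce or audit a result.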
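lakeFS is a data lake version control system that provides Git-like branching and commits for large datasets held in object storage. As a version control layer, it supports immutable snapshots, atomic commits, and zero-copy branching, creating isolated environments for data experiments without duplicating physical files. The system acts as an S3-compatible storage gateway and an Iceberg REST catalog, allowing standard cloud storage protocols and compatible clients to manage versioned tables. It serves as a data quality gatekeeper, using an event-driven hook system to validate datasets against governance policies before changes are merged into production. The platform covers a broad range of data governance features, including pull request collaboration, role-based access control, and data lineage tracking. It offers integrations for workflow orchestration, machine learning pipelines, and various big data compute engines, and supports multi-cloud storage connectivity as well as identity synchronization via SSO and SCIM. The software can be installed from binaries, containers, or a Helm chart for deployment on Kubernetes.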
Captures the current state of a branch as a unique, immutable commit for auditing and reproducibility.
This project is a Chinese localization repository and technical translation project designed to make concise programming projects and technical documentation accessible to Chinese speakers. It provides a collection of translated resources and curated mappings of computer science terminology to ensure consistent translation of technical concepts. The project implements a software localization workflow that converts English-language technical guides and codebase documentation into Chinese. This process utilizes a technical glossary resource and a resource-driven localization model to maintain t
Maintains data history and entity lifecycles through a sequence of immutable version layers.
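nit is a blockchain asset provenance platform and decentralized asset registry. It establishes a verifiable chain of custody for digital media by assigning files cryptographically unique identifiers and recording their origin, ownership, and modification history on a ledger. The project stands out by integrating IPFS for decentralized storage with a version control system that tracks asset evolution through immutable commits. It includes specialized tools for generative AI provenance tracking, allowing the creators and tools used in synthetic media to be recorded so that a transparent metadata tree is maintained. The system covers a broad range of functionality, including digital rights management, automatic royalty distribution via smart contracts, and content authenticity verification. It also implements a token-weighted governance model in which users can stake tokens to influence the protocol's direction through decentralized voting. The project is written in TypeScript.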
Tracks changes to digital assets through a sequence of immutable commits on IPFS.