1 repositorio
Creation of reproducible data slices using metadata queries without duplicating physical files.
Distinct from Data Versioning: Distinct from general Data Versioning as it focuses on defining specific subsets via metadata rather than full historical snapshots.
Explore 1 awesome GitHub repository matching data & databases · Versioned Data Subsetting. Refine with filters or upvote what's useful.
ClearML is a comprehensive MLOps platform designed to manage the end-to-end machine learning lifecycle, from initial experimentation to production deployment. It provides a suite of integrated tools including a pipeline orchestrator for automating workflows, an experiment tracking tool for logging hyperparameters and metrics, and a metadata-driven data versioning system for managing large-scale datasets and model artifacts. The platform is distinguished by its advanced compute management and serving capabilities. It features a GPU compute manager that supports fractional resource slicing and
Provides a metadata-driven method for defining specific data slices to ensure experiment reproducibility without duplicating storage.