1 repo
Utilities and methods for collecting, cleaning, and curating data samples for machine learning model training.
Distinguishing note: The shortlist was empty; this category is required to house data-gathering workflows for ML pipelines.
Explore 1 GitHub repository in Artificial Intelligence & ML · Dataset Preparation Tools.
This project is an interactive data science environment that combines code execution, rich media visualization, and narrative documentation into a persistent, browser-based platform. It serves as a comprehensive educational resource for scientific computing, providing a framework for iterative data analysis and machine learning prototyping. The environment is distinguished by its focus on high-performance numerical computing, utilizing vectorized array operations and memory-mapped data structures to handle large-scale computations efficiently. It features a unified estimator interface that st
Acquire a collection of background examples that do not contain the target features, for use in model training.