2 repositories
Tools for cleaning, transforming, and preparing raw data for machine learning pipelines.
Distinguishing note: Focuses on automated feature engineering for tabular data, distinct from generic ETL tools.
Explore 2 GitHub repositories matching Data & Databases · Data Preprocessing Utilities.
This project is an educational platform and research toolkit designed to teach deep learning through a combination of mathematical theory, visual diagrams, and executable code. It provides a comprehensive environment for building, training, and evaluating neural networks, grounding complex concepts in interactive computational notebooks that allow for hands-on experimentation. The framework distinguishes itself by interleaving theoretical foundations—including linear algebra, calculus, and probability—with practical implementations across multiple industry-standard libraries. It supports flex
Standardizes numerical features and encodes categorical variables for tabular data processing.
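As a rough illustration of what standardizing numerical features and encoding categorical variables means, here is a minimal standard-library sketch. It is not the project's own code: the helper names `standardize` and `one_hot` and the sample columns are hypothetical, chosen only for this example.

```python
from statistics import mean, pstdev


def standardize(values):
    """Rescale values to zero mean and unit variance (population std)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column carries no spread; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def one_hot(values):
    """Encode categories as 0/1 indicator rows, columns in sorted order."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    rows = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1
        rows.append(row)
    return categories, rows


# Hypothetical tabular columns, for illustration only.
ages = [20.0, 30.0, 40.0]
colors = ["red", "blue", "red"]

scaled = standardize(ages)
cats, encoded = one_hot(colors)
```

In practice a library such as scikit-learn would normally handle both steps; the sketch only shows the arithmetic behind them.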
This project is a model-agnostic interpretability framework and explainability tool designed to provide local interpretable explanations for individual predictions. It functions as a local surrogate model that approximates the behavior of any machine learning classifier or regression model to identify the most influential features for a specific instance. Because it is model-agnostic, it can explain predictions across tabular, text, and image data regardless of the underlying architecture. It employs local linear approximations and feature importance visualization t
Provides preprocessing utilities that combine discretization for continuous variables and sampling for categorical features.
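The combination of discretization and sampling described above can be sketched roughly as follows, assuming quartile bins for continuous values and frequency-weighted draws for categorical ones. This is an illustrative approximation using only the standard library, not the project's actual implementation; the function names and sample data are hypothetical.

```python
import random
from bisect import bisect_right
from collections import Counter
from statistics import quantiles


def quartile_cuts(values):
    """Return the three cut points that split values into quartiles."""
    return quantiles(values, n=4)


def discretize(value, cuts):
    """Map a continuous value to a bin index in 0..len(cuts)."""
    return bisect_right(cuts, value)


def sample_categorical(values, k, rng):
    """Draw k categories with probability proportional to observed frequency."""
    counts = Counter(values)
    cats = sorted(counts)
    weights = [counts[c] for c in cats]
    return rng.choices(cats, weights=weights, k=k)


# Hypothetical columns, for illustration only.
incomes = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0]
cuts = quartile_cuts(incomes)
bins = [discretize(v, cuts) for v in incomes]

rng = random.Random(0)  # seeded so repeated runs give the same draws
draws = sample_categorical(["a", "a", "a", "b"], k=5, rng=rng)
```

Binning continuous features first keeps the perturbed samples interpretable (each one falls in a named range), while frequency-weighted sampling keeps categorical perturbations close to the observed distribution.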