28 open-source projects similar to featureform/featureform, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best Featureform alternative.
Feast is an open-source feature store for machine learning that provides a central platform for defining, storing, and serving features across both training and inference workflows. It operates as a declarative system where feature definitions are written as code in Python files, synchronized to a central registry, and made available for low-latency online retrieval or point-in-time correct historical joins for training datasets. The project abstracts storage behind a pluggable architecture, allowing offline and online backends to be swapped without changing retrieval logic, and coordinates ma
This project is a comprehensive pandas data analysis tutorial and instructional guide designed for learning data manipulation and analysis. It serves as a tabular data processing guide and a manual for time series analysis, providing a structured approach to cleaning, merging, and transforming datasets. The repository functions as a data feature engineering course, providing tutorials on constructing and selecting dataset features to improve machine learning model performance. It also includes a vectorized data operations guide for performing element-wise mathematical computations and matrix
This is an interactive notebook-based course that teaches machine learning from Python fundamentals through deep learning and natural language processing. It uses real datasets and multiple frameworks within a structured, hands-on curriculum that combines concise explanations with executable code cells, built-in datasets, and embedded exercise checkpoints. Learning progresses through data preparation and exploration, classical machine learning workflows, computer vision with convolutional neural networks, and natural language processing with deep learning, all delivered as a cohesive progressi
tsfresh is an automated feature engineering tool and library designed to extract statistical characteristics from raw time series data. It transforms sequential data into tabular datasets, converting time series into a flat format where each row represents a unique entity and columns represent extracted features. The project distinguishes itself through a parallel data processing framework that distributes heavy computational workloads across multiple CPU cores. It also implements hypothesis-based feature selection to identify the most predictive characteristics and filter out irrelevant ones
A collection of pandas & scikit-learn compatible transformers for preprocessing and feature engineering 🛠
scikit-learn addon to operate on set/"group"-based features
A sklearn-compatible Python implementation of Multifactor Dimensionality Reduction (MDR) for feature construction.
A scikit-learn-compatible Python implementation of ReBATE, a suite of Relief-based feature selection algorithms for Machine Learning.
Featuretools is a Python data science library and automated feature engineering framework designed to create predictive features from multiple related datasets. It automates the data preparation and transformation steps required for machine learning models through deep feature synthesis. The library enables the automatic generation of comprehensive feature tables by applying recursive transformations to relational data. It supports the transformation of unstructured text into structured numeric features and allows users to define custom primitives to extend the synthesis process with specific
zoofs is a Python library for performing feature selection using a variety of nature-inspired wrapper algorithms. The algorithms range from swarm-intelligence to physics-based to evolutionary. It is an easy-to-use, flexible, and powerful tool for reducing the size of your feature set.
Open-source feature selection repository in Python
Python binding for libvips using cffi
Feathr – A scalable, unified data and AI engineering platform for enterprise
Hopsworks - Data-Intensive AI platform with a Feature Store
A set of tools for creating and testing machine learning features, with a scikit-learn compatible API
NitroFE is a Python feature engineering engine whose modules internally store past dependent values so that features can be calculated continuously.
scikit-image is a Python image processing library and scientific image analysis toolkit. It provides a framework for digital image processing and computer vision, utilizing numerical arrays for pixel-level manipulations. The library enables the quantification of image properties and the detection of visual features, such as edges and blobs. It includes tools for image segmentation and the extraction of textures and patterns to characterize objects within visual data. Capabilities cover image manipulation through color space conversion, geometric transformations, and digital restoration. It a
Data search & enrichment library for Machine Learning → Easily find and add relevant features to your ML & AI pipeline from hundreds of public and premium external data sources, including open & commercial LLMs
A powerful tool for creating high-quality training datasets for Large Language Models (LLMs); it quickly generates high-quality datasets for LLM fine-tuning.
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
Python package implementing ML feature engineering and pre-processing for polars or pandas dataframes.