30 open-source projects similar to numenta/nupic, ranked by how many features they share with it. Compare stars, activity, and what each one does to find the best NuPIC alternative.
Kats is a time series analysis framework and library providing tools for statistical characterization, anomaly detection, and trend forecasting. It functions as a toolkit for predicting future values based on historical data and identifying irregular patterns or structural change points within temporal sequences. The project includes a temporal feature extraction tool to calculate descriptive statistics and characteristics that summarize time series behavior. It also provides a system for model hyperparameter tuning using self-supervised learning to improve the scale and generalization of pre
TimesFM is a time series foundation model designed to generalize across diverse temporal datasets for forecasting and anomaly detection. It functions as a pretrained model for predicting future values in univariate time series data, eliminating the need for manual training from scratch. The project includes a framework for adapting pretrained weights to specific datasets using low-rank adaptation to improve accuracy. It also provides specialized capabilities for integrating time-series predictions as tools within autonomous AI agent architectures and automated workflows. The system supports
This PyTorch-based deep learning library provides a framework for analyzing and forecasting temporal data. It implements specialized architectures for time series forecasting, anomaly detection, data imputation, and classification. The project distinguishes itself through the inclusion of zero-shot inference capabilities, allowing large-scale temporal models to be evaluated on unseen datasets without requiring task-specific fine-tuning. The framework covers a broad range of analytical capabilities, including the recovery of missing values in incomplete datasets, the identification of irregul
statsforecast is a high-performance statistical time series forecasting library designed to generate point forecasts and prediction intervals. It functions as a distributed time series framework that utilizes a C-based forecasting engine and an automated model selector to identify and fit the optimal statistical model for every unique series in a dataset. The system also includes a time series anomaly detector to identify unusual data points by comparing observed values against probabilistic forecast intervals. The project is distinguished by its ability to handle massive-scale parallel forec
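The interval-based anomaly check described for statsforecast can be sketched in plain Python. This is only an illustration of the idea, not statsforecast's API: the function name `interval_anomalies` is hypothetical, and the mean-plus-standard-deviation band is a stand-in assumption for a real model's prediction interval.

```python
from statistics import mean, stdev
from typing import List


def interval_anomalies(
    history: List[float], observed: List[float], z: float = 1.96
) -> List[int]:
    """Return indices of observed values that fall outside a naive interval.

    Assumption: the "forecast" is the historical mean, and the interval is
    mean +/- z * sample standard deviation. A real forecaster would supply
    its own interval bounds instead.
    """
    center = mean(history)
    spread = z * stdev(history)  # stdev needs at least two historical points
    lower, upper = center - spread, center + spread
    return [i for i, value in enumerate(observed) if not lower <= value <= upper]


history = [10.0, 11.0, 9.0, 10.0, 10.0]
flagged = interval_anomalies(history, [10.5, 15.0, 9.0, 8.0])
```

A wider interval (larger `z`) flags fewer points, which is the usual trade-off between missed anomalies and false alarms.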
Darts is a Python time series library designed for forecasting, anomaly detection, and the preprocessing of univariate and multivariate temporal data. It serves as a comprehensive framework for training and evaluating a wide range of statistical, machine learning, and deep learning models to predict future numerical values. The toolkit is distinguished by its support for global time series modeling, allowing a single model to be trained across multiple different series to leverage shared patterns. It also features a hierarchical time series manager to ensure consistency between aggregate and
sktime is a machine learning framework for time series analysis. It provides a unified toolkit for implementing time series classification, forecasting, and anomaly detection using standardized machine learning interfaces. The library serves as a collection of tools for assigning categorical labels to temporal sequences, predicting future values based on historical patterns, and identifying outliers or unusual patterns within temporal data. The framework includes capabilities for panel-data handling and pipeline-based transformations. It utilizes a unified API wrapper and plugin-based model
sktime is a machine learning framework designed for time series analysis. It provides a unified interface for performing time series forecasting, classification, and anomaly detection, integrating these capabilities into a standardized toolkit compatible with the scikit-learn API. The framework allows for the construction of complex analysis workflows through model pipelining and ensemble-based aggregation. It uses adapter-based integration to wrap external time series libraries, providing a single entry point for diverse algorithmic implementations. Its capabilities cover temporal data tran
PyCaret is a Python AutoML platform and MLOps lifecycle manager designed to automate machine learning workflows. It functions as a low-code environment that leverages a scikit-learn native engine to execute preprocessing, training, and evaluation for tabular data. The platform distinguishes itself as an LLM-powered ML copilot, using large language model agents to analyze datasets, design experiment configurations, and explain model results. It also serves as a Kubernetes ML orchestrator and model registry, enabling the versioning of trained pipelines and their promotion to production API endp
MMLSpark is a distributed framework for executing machine learning models, data transformations, and AI service integrations across Apache Spark clusters. It functions as a distributed machine learning library and pipeline orchestrator, allowing users to integrate pre-trained cognitive services and custom models into large-scale batch and streaming workflows. The project is distinguished by its ability to incorporate external AI services and web APIs directly into big data pipelines for text and vision analysis. It provides a scalable model training framework that coordinates gradient boostin
fastai is a high-level deep learning library built on PyTorch that provides a unified interface for managing the entire machine learning lifecycle. It functions as a comprehensive training toolkit, abstracting hardware management and automating complex training loops to simplify the construction and execution of neural network models. The framework is distinguished by its notebook-centric development environment and a type-dispatching data pipeline that automatically applies transformations based on input data formats. It emphasizes transfer learning through discriminative layer-wise optimiza
Merlion is a time series machine learning framework designed for anomaly detection and forecasting. It provides a unified interface for implementing and applying various statistical and machine learning models to temporal data streams. The project includes a benchmarking dashboard that allows for the visual testing and evaluation of models against historical ground truth datasets. This web interface enables experimentation with different models on custom datasets without manual coding. The framework covers capabilities for identifying outliers, predicting future time series values, and mea
PaddleX is a PaddlePaddle-based framework for building, deploying, and fine-tuning AI model pipelines, with pre-built support for computer vision, OCR, document analysis, and time series tasks. It offers a toolkit of ready-to-use pipelines for image classification, object detection, segmentation, and pose estimation, alongside an end-to-end OCR document analysis pipeline that extracts text, tables, formulas, and layout information. The platform also includes a dedicated time series forecasting pipeline for analyzing historical data to detect anomalies, classify patterns, and predict future val
SynapseML is an Apache Spark machine learning library designed for building and scaling machine learning workflows and data pipelines across distributed clusters. It serves as a distributed machine learning pipeline framework and a distributed inference engine for executing hardware-accelerated predictions and deep learning tasks on large-scale datasets. The project functions as a cloud AI integration layer, allowing users to apply pretrained artificial intelligence services for text, vision, and speech within distributed pipelines. It also includes a dedicated suite of tools for distributed
Brain.js is a JavaScript neural network library for building, training, and running machine learning models in the browser or Node.js. It provides implementations for several network types, including feedforward networks, recurrent neural networks for time series forecasting, and autoencoders for data compression and denoising. The library features WebGL-based GPU acceleration to increase the speed of neural network computations on the graphics processor. It also includes a visualization tool that generates SVG images to represent the topology and layers of a feedforward network. The framewo
tsfresh is an automated feature engineering tool and library designed to extract statistical characteristics from raw time series data. It transforms sequential data into tabular datasets, converting time series into a flat format where each row represents a unique entity and columns represent extracted features. The project distinguishes itself through a parallel data processing framework that distributes heavy computational workloads across multiple CPU cores. It also implements hypothesis-based feature selection to identify the most predictive characteristics and filter out irrelevant ones
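The flattening step described for tsfresh, where each entity's series becomes one row of summary features, can be sketched as follows. This does not use tsfresh itself: `extract_features` here is a hypothetical stand-alone helper, and the four features are an arbitrary illustrative choice.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Tuple


def extract_features(
    records: Iterable[Tuple[str, float]]
) -> Dict[str, Dict[str, float]]:
    """Group (entity_id, value) records and summarize each entity as one row."""
    series = defaultdict(list)
    for entity_id, value in records:
        series[entity_id].append(value)  # preserves arrival order per entity
    return {
        entity_id: {
            "mean": mean(values),
            "minimum": min(values),
            "maximum": max(values),
            "length": float(len(values)),
        }
        for entity_id, values in series.items()
    }


table = extract_features([("a", 1.0), ("a", 3.0), ("b", 5.0)])
```

Each key of `table` is one entity and each inner dictionary is that entity's feature row, which is the tabular shape a downstream classifier or regressor would consume.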
River is a Python framework for online machine learning, designed to train and evaluate models on streaming data. It enables incremental learning by updating model parameters one observation at a time, eliminating the need to store full training datasets in memory. The library distinguishes itself through a dedicated concept drift detection system that monitors changes in data distributions to trigger model adaptation. It also provides a progressive validation framework that simulates real-time deployment by testing models on samples before using them for training. The system covers a broad
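The incremental learning and progressive validation described for River can be illustrated with a toy model. This is a minimal sketch that does not import River: `RunningMean` and `progressive_mae` are hypothetical names, and predicting the running mean is an assumed stand-in for a real online model.

```python
from typing import Iterable


class RunningMean:
    """Toy incremental model: predicts the mean of the targets seen so far."""

    def __init__(self) -> None:
        self.n = 0
        self.mean = 0.0

    def predict_one(self) -> float:
        return self.mean

    def learn_one(self, y: float) -> None:
        # Incremental mean update; no past observations are stored.
        self.n += 1
        self.mean += (y - self.mean) / self.n


def progressive_mae(stream: Iterable[float]) -> float:
    """Test-then-train: score each sample before the model learns from it."""
    model = RunningMean()
    total_error = 0.0
    count = 0
    for y in stream:
        total_error += abs(y - model.predict_one())  # test first
        model.learn_one(y)  # then train on the same sample
        count += 1
    return total_error / count if count else 0.0


score = progressive_mae([2.0, 4.0, 6.0])
```

Because every prediction is made before the corresponding sample is learned, the resulting error reflects how the model would have performed on unseen data at each point in the stream.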
GluonTS is a probabilistic time series library and deep learning forecasting framework. It provides a toolkit for building, training, and evaluating neural network architectures that predict future values as probability distributions to quantify uncertainty. The project distinguishes itself by supporting zero-shot forecasting and integrating diverse modeling approaches, including deep probabilistic neural networks and wrappers for external statistical libraries such as Prophet and R forecast. It implements specialized architectural primitives like causal convolutions and invertible residual n
AutoGluon is an automated machine learning framework designed to optimize model selection and hyperparameter tuning across tabular, text, image, and time series data. It functions as an ensemble learning library and a tabular data prediction engine, aiming to build high-accuracy predictive models without manual algorithm selection. The framework integrates multimodal machine learning pipelines that combine disparate data types into a single representation using specialized encoders. It also includes a probabilistic time series forecaster that fits multiple statistical and deep learning models
PyTorch Forecasting is a deep learning framework designed for building and training neural network architectures specifically for time series forecasting. It serves as a comprehensive toolkit for implementing autoregressive models, multi-horizon forecasting, and probabilistic prediction intervals using PyTorch tensors. The library distinguishes itself through a probabilistic forecasting toolkit that generates prediction intervals and quantile forecasts using both parametric and non-parametric distributions. It further provides a neural network model optimizer for automated hyperparameter tuni
Smile is a comprehensive JVM machine learning library and statistical computing toolkit. It provides a suite of algorithms for classification, regression, and clustering, implemented natively for Java, Scala, and Kotlin. The project also functions as a deep learning framework, a natural language processing library, and an inference engine for large language models. The library distinguishes itself through GPU acceleration via LibTorch bindings and support for the ONNX model interchange format. It includes specialized capabilities for large language model inference, featuring Byte-Pair Encodin
2-2000x faster ML algos, 50% less memory usage, works on all hardware - new and old.
A small machine learning library written in Lisp (Clojure) aimed at providing simple, concise implementations of machine learning techniques and utilities.
Caffe is a high-performance deep learning framework designed for training and deploying deep neural networks. It functions as a machine learning engine and a convolutional neural network library, providing a C++ backend to accelerate computations on both GPUs and CPUs. The system includes a specialized toolset for computer vision, enabling tasks such as object detection, semantic segmentation, and large-scale image retrieval. It supports the deployment of pre-trained models for image and scene recognition, as well as the ability to fine-tune neural network weights for specialized tasks. The
🔥 Cogitare - A Modern, Fast, and Modular Deep Learning and Machine Learning framework for Python
CatBoost is a gradient boosting machine learning library used to train decision tree ensembles for regression, classification, and ranking tasks. It functions as a high-performance framework that provides a categorical data processor for transforming non-numeric features, a distributed trainer for large-scale datasets, and GPU acceleration to speed up model construction. The library distinguishes itself through native handling of categorical data and text features, removing the need for manual encoding. It includes a specialized model interpretability tool that leverages SHAP values and featu