30 open-source projects similar to rhiever/data-analysis-and-machine-learning-projects, ranked by how many features they have in common. Compare stars, activity, and what each project does to find the best alternative to Data Analysis And Machine Learning Projects.
TPOT is a Python automated machine learning tool and pipeline framework. It automatically searches, selects, and tunes machine learning algorithms and hyperparameters to identify the most effective pipeline. The system uses genetic programming, a form of evolutionary algorithm, to optimize these pipelines. To accelerate the search, it evaluates candidate pipelines in parallel across multiple processor cores. The framework also supports custom objective functions, so pipelines can be optimized for specific performance metrics.
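As a rough illustration of the evolutionary search described for TPOT, the sketch below evolves toy pipeline configurations against a user-supplied objective and scores each generation in parallel. It is not TPOT's API: the search space, `toy_objective`, and `evolve` are hypothetical names, the loop is a simple mutation-only evolutionary search rather than tree-based genetic programming, and a thread pool stands in for multi-core evaluation.

```python
import random
from concurrent.futures import ThreadPoolExecutor

# Hypothetical search space: a "pipeline" is a scaler, a model, and one hyperparameter.
SCALERS = ["none", "standard", "minmax"]
MODELS = ["tree", "linear", "knn"]


def random_pipeline(rng):
    """Draw a random pipeline configuration."""
    return {
        "scaler": rng.choice(SCALERS),
        "model": rng.choice(MODELS),
        "depth": rng.randint(1, 10),
    }


def mutate(pipeline, rng):
    """Return a copy of the pipeline with one randomly chosen field changed."""
    child = dict(pipeline)
    key = rng.choice(["scaler", "model", "depth"])
    if key == "scaler":
        child["scaler"] = rng.choice(SCALERS)
    elif key == "model":
        child["model"] = rng.choice(MODELS)
    else:
        child["depth"] = min(10, max(1, child["depth"] + rng.choice([-1, 1])))
    return child


def toy_objective(pipeline):
    """Stand-in for a cross-validated score; a real objective would train a model."""
    score = 0.0
    score += 1.0 if pipeline["scaler"] == "standard" else 0.0
    score += 1.0 if pipeline["model"] == "tree" else 0.0
    score -= abs(pipeline["depth"] - 5) / 10
    return score


def evolve(objective, generations=30, population_size=20, workers=4, seed=0):
    """Keep the better half of each generation and refill it with mutated copies."""
    rng = random.Random(seed)
    population = [random_pipeline(rng) for _ in range(population_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(generations):
            # Score all candidates concurrently.
            scores = list(pool.map(objective, population))
            ranked = [
                p for _, p in sorted(
                    zip(scores, population), key=lambda sp: sp[0], reverse=True
                )
            ]
            survivors = ranked[: population_size // 2]
            children = [
                mutate(rng.choice(survivors), rng)
                for _ in range(population_size - len(survivors))
            ]
            population = survivors + children
    return max(population, key=objective)


best = evolve(toy_objective)
print(best, toy_objective(best))
```

Because the top half of each generation survives unchanged, the best score never decreases from one generation to the next. Swapping `toy_objective` for a different function is the sketch's analogue of a custom objective.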
TensorFlow is a comprehensive machine learning framework designed for the construction, training, and deployment of complex mathematical models. It utilizes a graph-based execution model that represents operations as directed acyclic graphs, enabling automatic differentiation and efficient parallel processing. The system provides high-level interfaces for defining neural network architectures, alongside a robust engine for managing multidimensional array structures and tensor mathematics. The framework distinguishes itself through a scalable distributed runtime that orchestrates workloads acr
Redis is a high-performance in-memory key-value store that functions as a distributed cache, message broker, and NoSQL database. It provides sub-millisecond read and write access to data stored in RAM and can operate as a vector database for indexing high-dimensional embeddings. The system supports a wide range of data storage and synchronization primitives, including the management of strings, hashes, lists, sets, and JSON documents. It enables real-time data operations through atomic transactions, hybrid persistence using snapshots and append-only logs, and high-availability configurations
h2oGPT is a self-hosted platform designed for running large language models and executing retrieval-augmented generation workflows locally. It provides a comprehensive web interface that allows users to index private document collections into searchable databases, enabling context-aware question answering and summarization without exposing sensitive data to external services. The platform distinguishes itself by offering a modular architecture that supports both local model execution and connections to external inference servers. It facilitates the development of autonomous agents capable of
This project is an academic curriculum repository and educational resource center for studying probability, statistics, and machine learning. It serves as a deep learning course website and a hub for instructional materials, providing a structured collection of content designed to teach neural network architectures. The repository distinguishes itself by combining a comprehensive educational resource with a machine learning project archive. It provides a curated set of research examples and implementation guides for a wide range of models, including multilayer perceptrons, convolutional netwo
Evolve is an evolution-based organism designer and GPU-accelerated artificial life simulator that combines interactive particle physics with a real-time simulation editor. At its core, it runs genetic algorithm evolution on self-replicating graph structures to evolve digital organisms, offloading particle physics, neural networks, and rendering entirely to the GPU through a compute shader pipeline for real-time performance. The project distinguishes itself with graph-based organism design that uses a directed graph editor to visually define organism structure, connections, and neural controll
This is a Python automated machine learning framework designed to automate the design and optimization of machine learning pipelines. It functions as a genetic programming pipeline optimizer and an automated feature selection tool, using evolutionary search to discover the most effective sequences of data processing and model steps. The project focuses on multi-objective optimization to balance competing performance metrics simultaneously. It employs a genetic selection process to identify impactful variables and remove noise from raw datasets, ensuring the resulting machine learning solution
Geolib is a geospatial calculation library and point analysis tool. It provides a collection of utilities for computing distances, bearings, and areas between coordinates, as well as converting geographic measurements and coordinate formats. The library features a Well-Known Text geometry parser to convert WKT strings into coordinate structures for polygon analysis. It includes specialized tools for geofencing and point containment, enabling the determination of whether a coordinate falls within a defined polygon or a specified radius. The toolset covers broad capability areas including loca
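The distance and containment checks mentioned for Geolib can be sketched as follows. This is an illustrative Python version, not Geolib's own API: the function names are hypothetical, the distance uses the haversine formula with a mean Earth radius of 6,371,000 m, and the polygon test uses planar ray casting, which is an assumption that only holds for small areas away from the poles and the antimeridian.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_radius(point, center, radius_m):
    """True if point lies within radius_m meters of center; both are (lat, lon)."""
    return haversine_m(point[0], point[1], center[0], center[1]) <= radius_m


def point_in_polygon(point, polygon):
    """Ray-casting containment test; polygon is a list of (x, y) vertices."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


square = [(0, 0), (0, 1), (1, 1), (1, 0)]
print(point_in_polygon((0.5, 0.5), square))  # True
print(point_in_polygon((1.5, 0.5), square))  # False
```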
scikit-opt is a Python optimization library and numerical framework designed to solve complex global optimization problems. It provides a suite of metaheuristic algorithms and tools for finding global minima or maxima of objective functions. The library implements a variety of nature-inspired and swarm intelligence algorithms, including Genetic Algorithms, Particle Swarm Optimization, Differential Evolution, Simulated Annealing, and Ant Colony Optimization. It includes specialized solvers for discrete combinatorial challenges, such as the Traveling Salesman Problem. The framework supports th
Zenbot is an automated cryptocurrency trading bot designed to execute trades on exchanges based on technical analysis and predefined risk parameters. It functions as a technical analysis engine that processes market data through mathematical indicators to generate actionable trade signals. The system includes a genetic algorithm strategy optimizer to automatically discover the most profitable parameter configurations. It provides multiple simulation environments, including a trading strategy backtester for replaying historical data and a paper trading simulator for testing strategies against
This project is a neural network route optimizer and unsupervised learning tool designed to solve the traveling salesman problem. It functions as a self-organizing map solver that calculates near-optimal paths through a set of coordinates to determine the shortest possible tour. The system utilizes a Kohonen map implementation to organize high-dimensional data into a lower-dimensional representation. It employs competitive learning and topology preservation to approximate solutions for combinatorial optimization problems. The solver covers route optimization analysis and heuristic pathfindin
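To make the self-organizing map approach concrete, here is a minimal sketch of a ring-shaped Kohonen map applied to a small traveling salesman instance. It is not taken from the project: `som_tsp`, the neuron count, and the decay rates are assumed values chosen for illustration, and city coordinates are assumed to be normalized to the unit square.

```python
import math
import random


def som_tsp(cities, iterations=2000, learning_rate=0.8, seed=0):
    """Return a visiting order (list of city indices) found by a ring-shaped SOM."""
    rng = random.Random(seed)
    n_neurons = len(cities) * 8
    # Neurons start at random positions; their list order defines the ring topology.
    neurons = [[rng.random(), rng.random()] for _ in range(n_neurons)]
    radius = n_neurons / 10

    def winner_for(city):
        # Competitive learning: the closest neuron wins.
        cx, cy = city
        return min(
            range(n_neurons),
            key=lambda i: (neurons[i][0] - cx) ** 2 + (neurons[i][1] - cy) ** 2,
        )

    for _ in range(iterations):
        cx, cy = rng.choice(cities)
        winner = winner_for((cx, cy))
        for i in range(n_neurons):
            # Distance along the ring, wrapping around at the ends.
            ring = abs(i - winner)
            ring = min(ring, n_neurons - ring)
            influence = math.exp(-(ring ** 2) / (2 * radius ** 2))
            # Pull the winner and its ring neighbors toward the city.
            neurons[i][0] += learning_rate * influence * (cx - neurons[i][0])
            neurons[i][1] += learning_rate * influence * (cy - neurons[i][1])
        # Shrink the step size and neighborhood so the ring settles.
        learning_rate *= 0.9997
        radius = max(radius * 0.997, 1.0)

    # Topology preservation: reading cities in ring order yields the tour.
    return sorted(range(len(cities)), key=lambda c: winner_for(cities[c]))


def tour_length(cities, order):
    """Length of the closed tour that visits cities in the given order."""
    return sum(
        math.dist(cities[order[i]], cities[order[(i + 1) % len(order)]])
        for i in range(len(order))
    )


rng = random.Random(1)
cities = [(rng.random(), rng.random()) for _ in range(10)]
order = som_tsp(cities)
print(order, round(tour_length(cities, order), 3))
```

The result is a heuristic tour, not a guaranteed optimum; quality depends on the number of neurons and on how slowly the neighborhood radius shrinks.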
This is a Ruby toolkit for converting addresses to geographic coordinates and performing reverse geocoding via multiple external API providers. It provides a library for integrating location services into Ruby applications, including an IP geolocation tool to translate IP addresses into coordinates, city names, and country data. The project includes a command line interface for bulk geocoding of database records with integrated rate limiting and a geospatial query engine for calculating distances and performing radius or bounding box searches. It also features a mocking framework that provide
EvoAgentX is an agent platform that combines human-in-the-loop checkpoints, MCP tool integration, multi-agent workflow orchestration, and self-improvement capabilities. It functions as a self-improving agent framework that connects to MCP-compatible servers and orchestrates multi-agent workflows using natural-language goals, while also serving as a platform that discovers, configures, and manages tools from MCP servers for use in automated agent workflows. The platform distinguishes itself through a dual-memory agent architecture that maintains short-term and persistent memory stores, enablin
geopy is a Python geocoding library and geolocation client used to convert human-readable addresses into geographic coordinates and resolve coordinates back into street addresses using various third-party web services. The library provides a consistent provider-based interface that abstracts multiple external geocoding services, allowing for interchangeable backends. It includes built-in request rate limiting and asynchronous client interfaces to manage API call frequency and execute concurrent lookups without halting execution. Beyond geocoding, the project includes geospatial utilities for
blessed-contrib is a terminal user interface framework and a Node.js console widget library designed for building data-driven command line interfaces. It serves as an ASCII data visualization toolkit and a dashboard framework for organizing grid-based layouts and interactive elements within a console. The project provides a collection of reusable terminal components, including a command line image renderer and tools for text-based graphic rendering. It specifically enables the creation of terminal dashboards through a system for positioning multiple widgets across rows and columns and a mecha
NNI is an AutoML toolkit designed to automate machine learning lifecycles. It functions as a hyperparameter optimization framework, a neural architecture search tool, and a model compression suite. The project provides a distributed training orchestrator to manage machine learning workloads across local machines, remote servers, and cloud platforms. It enables the discovery of efficient model structures through reinforcement learning and one-shot optimization methods, while utilizing Bayesian and evolutionary algorithms to automate hyperparameter tuning. Additional capabilities include tools
This project is a comprehensive collection of common computer science algorithms and data structures implemented in Swift. It serves as an educational reference and library for studying computational complexity, algorithmic logic, and data structure engineering through practical code examples. The repository provides a wide suite of data structure implementations, including various types of linked lists, heaps, hash tables, and an extensive range of hierarchical trees such as Red-Black, B-Tree, and Splay trees. It also covers diverse sorting and searching techniques, from basic bubble sort to
Smile is a comprehensive JVM machine learning library and statistical computing toolkit. It provides a suite of algorithms for classification, regression, and clustering, implemented natively for Java, Scala, and Kotlin. The project also functions as a deep learning framework, a natural language processing library, and an inference engine for large language models. The library distinguishes itself through GPU acceleration via LibTorch bindings and support for the ONNX model interchange format. It includes specialized capabilities for large language model inference, featuring Byte-Pair Encodin
Evilcharts is a data visualization library and animated charting framework designed to render interactive data graphics. It functions as a responsive data graphics engine that transforms raw data sets into visual formats such as bars, lines, and pies. The project focuses on interactive data visualization by incorporating motion triggers and visual transitions. It provides tools for custom chart styling to align visual effects and colors with specific brand guidelines and design requirements. The engine manages responsive data dashboards through automatic layout scaling to maintain consistenc
AutoGluon is an automated machine learning framework designed to optimize model selection and hyperparameter tuning across tabular, text, image, and time series data. It functions as an ensemble learning library and a tabular data prediction engine, aiming to build high-accuracy predictive models without manual algorithm selection. The framework integrates multimodal machine learning pipelines that combine disparate data types into a single representation using specialized encoders. It also includes a probabilistic time series forecaster that fits multiple statistical and deep learning models
This project is a collection of reinforcement learning implementations and educational materials written in Python. It provides neural network architectures for solving control tasks through deep reinforcement learning, spanning value-based and policy-gradient methods. The repository includes a library of evolutionary strategies and genetic algorithms as alternatives to gradient-based learning. It also features a model-based system for predicting future environment states and rewards to enable internal simulation and offline planning. The codebase covers a wide range of capabilities, includi
This repository serves as a centralized collection of state-of-the-art deep learning architectures and reference implementations designed for research and application development. It provides a comprehensive toolkit for computer vision and natural language processing, offering pre-built models and training pipelines for tasks ranging from image classification and object detection to complex sequence modeling. The project distinguishes itself by providing a flexible execution harness that manages the entire training lifecycle, including data ingestion and backpropagation. It supports scalable
Openface is a deep learning toolkit designed for facial recognition and identity verification. It provides a comprehensive pipeline for detecting faces, aligning landmarks, and transforming facial images into compact numerical vectors. By utilizing these embeddings, the system enables identity classification and similarity comparison through geometric distance calculations. The project distinguishes itself by integrating research-oriented diagnostic tools alongside its core recognition capabilities. It includes utilities for visualizing high-dimensional feature clusters, inspecting internal c
Magenta is an AI creative suite and TensorFlow-based generative art framework used to train and deploy models that produce artistic media. It functions as a generative music library and a deep learning art generator, providing tools to automate the creation of original musical compositions and visual artwork. Its trained generative models produce original songs, images, and drawings based on patterns learned from data.
This project is a community-maintained directory that serves as a comprehensive index of software tools, frameworks, and educational materials. It functions as an open-source knowledge base, organizing diverse engineering domains and technical resources into a structured taxonomy to assist developers in discovering high-quality content. The directory distinguishes itself through a decentralized peer-review model, where independent contributors curate, verify, and update entries to ensure accuracy and relevance. All information is stored in a version-controlled, flat-file markdown format, whic
Charts is a mobile data visualization library designed for rendering interactive graphical representations of complex datasets. It provides a declarative configuration interface that maps data structures to visual components, supporting a variety of chart types including line, bar, pie, scatter, and radar plots. The library distinguishes itself through a hardware-accelerated drawing layer that ensures high-performance rendering across mobile platforms. It features a gesture-driven transformation engine that enables users to pan, zoom, and scale views, alongside an interpolated animation syste
ChartGPU is a high-performance visualization library designed to render large-scale datasets and real-time data streams using hardware acceleration. It functions as a component-based tool that integrates into declarative user interfaces, allowing developers to build responsive, themeable charts that maintain smooth interaction even when processing massive amounts of information. The library distinguishes itself through a specialized rendering engine that employs screen-space binning and zoom-aware data resampling to manage dense datasets. It provides advanced interactive capabilities, includi
This project is an automated machine learning framework and toolkit designed for training and tuning custom models for classification, regression, and recommendations. It functions as a multimodal machine learning toolkit capable of processing and training models using a combination of text, image, audio, and sensor data. The framework distinguishes itself as a multimodal data processor that can handle and visualize large datasets on a single machine using column-oriented disk storage. It includes a core machine learning model generator that converts trained models into formats compatible wit
Gentelella is a collection of pre-configured interface templates and a component library designed for building administration panels, data dashboards, and internal management consoles. It provides a Bootstrap 5 based framework that includes accessible web interface templates and PWA-ready dashboard shells. The project features specialized templates for data visualization, utilizing modular chart factories to render line, bar, radar, and heatmap visualizations. It includes a set of ready-to-use interface elements for enterprise prototyping, such as kanban boards, file managers, and interactive