Why is openai/gym a recommended Data Discretization repository?

Translates high-level agent decisions into specific numerical values compatible with underlying physics or logic engines.

Why is oi-wiki/oi-wiki a recommended Data Discretization repository?

Maps values to sorted ranks to facilitate efficient algorithmic processing.

Why is nlp-love/ml-nlp a recommended Data Discretization repository?

Converts continuous variables into discrete bins to improve model robustness against outliers and introduce non-linearity.

Why is lllyasviel/framepack a recommended Data Discretization repository?

Discretizes historical data into tokens to align training distributions with inference patterns.

Why is mathfoundationrl/book-mathematical-foundation-of-reinforcement-learning a recommended Data Discretization repository?

Maps continuous environment coordinates to numerical indices to simplify transition probability calculations.

Why is marcotcr/lime a recommended Data Discretization repository?

Provides utilities to convert continuous numerical variables into discrete bins to simplify feature influence explanations.

Why is facebookresearch/seamless_communication a recommended Data Discretization repository?

Transforms continuous audio waveforms into sequences of discrete units for efficient model processing.

Why is sjwhitworth/golearn a recommended Data Discretization repository?

Processes discrete data by merging histograms and combining data points to prepare datasets for predictive modeling.

Why is iamseancheney/python_for_data_analysis_2nd_chinese_version a recommended Data Discretization repository?

Converts continuous numerical features into discrete bins or quartiles for distribution analysis.

Why is saulpw/visidata a recommended Data Discretization repository?

Groups numeric values into calculated ranges to create histograms and visualize distribution.

22 repositories

Awesome GitHub Repositories · Data Discretization

Maps continuous or large-range values to sorted ranks for efficient processing.

Distinct from Data Sorting Engines: focuses on rank-mapping for algorithmic efficiency rather than general dataset ordering.
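
The rank-mapping idea described above can be illustrated with a minimal sketch. The function name `compress_ranks` is hypothetical and not taken from any listed repository; it assumes the values are hashable and mutually comparable.

```python
def compress_ranks(values):
    """Map each value to its 0-based rank among the distinct sorted values."""
    # Deduplicate, sort once, then look ranks up in O(1) per value.
    rank_of = {v: i for i, v in enumerate(sorted(set(values)))}
    return [rank_of[v] for v in values]


ranks = compress_ranks([1_000_000_000, -5, 42, -5, 700])
print(ranks)  # [3, 0, 1, 0, 2]
```

Equal inputs share a rank and relative order is preserved, so a structure indexed by rank (for example an array of size 4 here) can stand in for one indexed by the original large-range values.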

Explore 22 awesome GitHub repositories matching scientific & mathematical computing · Data Discretization.


openai/gym
37,223 GitHub stars
Gym is a reinforcement learning environment toolkit and agent simulation framework. It provides a standardized API and a universal communication interface that defines how learning agents interact with simulation environments through actions and observations. The project includes a benchmark environment suite and a diverse library of pre-configured simulation worlds, including physics engines and classic control tasks. It enables the creation of custom simulation environments to train agents in specific operational scenarios while ensuring reproducibility across different learning algorithms.
Translates high-level agent decisions into specific numerical values compatible with underlying physics or logic engines.
Python
OI-wiki/OI-wiki
26,176 GitHub stars
This project is a comprehensive, community-maintained knowledge base and toolkit designed for competitive programming. It serves as a centralized repository for algorithmic theory, data structures, and mathematical techniques, providing a structured reference for informatics and collegiate programming competitions. The project distinguishes itself by integrating educational content with a robust suite of automation utilities. It provides a complete workflow for competitive programming, including tools for automated test case generation, solution verification, and direct interaction with onlin
Maps values to sorted ranks to facilitate efficient algorithmic processing.
TypeScript · acm-icpc · acm-icpc-handbook · algorithms
NLP-LOVE/ML-NLP
17,725 GitHub stars
This project is a machine learning algorithm reference and implementation guide that provides theoretical foundations and code for supervised learning, deep learning, and natural language processing. It serves as a comprehensive toolkit for implementing predictive models and a technical reference for algorithm engineering. The project focuses on ensemble learning frameworks, including the construction of decision trees, random forests, and gradient boosting models. It also functions as a probabilistic graphical model library and an NLP algorithm reference, with specific implementations for se
Converts continuous variables into discrete bins to improve model robustness against outliers and introduce non-linearity.
Jupyter Notebook · deep-learning · machine-learning · nlp
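
As a rough illustration of why binning can blunt the effect of outliers, the following sketch uses equal-width bins with clamping. It is not code from the ML-NLP repository; `equal_width_bin`, the age values, and the range 0 to 100 are assumptions made for the example.

```python
def equal_width_bin(x, low, high, n_bins):
    """Return the 0-based index of the equal-width bin over [low, high] that holds x.

    Values outside the range are clamped into the first or last bin.
    """
    if n_bins < 1 or high <= low:
        raise ValueError("need n_bins >= 1 and high > low")
    width = (high - low) / n_bins
    index = int((x - low) // width)
    return max(0, min(n_bins - 1, index))


ages = [3, 18, 35, 64, 250]  # 250 is an implausible outlier
bins = [equal_width_bin(a, 0, 100, 5) for a in ages]
print(bins)  # [0, 0, 1, 3, 4]
```

The outlier 250 ends up in the same top bin as any value from 80 to 100, so a downstream model sees a bounded categorical feature instead of an extreme magnitude.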
lllyasviel/FramePack
17,028 GitHub stars
FramePack is a neural video synthesis engine and generation framework designed to produce long, temporally consistent video sequences. It functions as a diffusion model optimizer, providing a suite of techniques to manage the computational demands of high-parameter video models while maintaining visual stability during extended generation tasks. The system distinguishes itself through a hierarchical approach to frame prediction, which plans distant anchor frames before filling in intermediate content to prevent cumulative temporal drift. By utilizing constant-length context compression and to
Discretizes historical data into tokens to align training distributions with inference patterns.
Python
MathFoundationRL/Book-Mathematical-Foundation-of-Reinforcement-Learning
16,543 GitHub stars
This project is an educational resource designed to teach the mathematical foundations and core algorithms of reinforcement learning. It provides a structured academic curriculum that combines textbooks, lecture materials, and practical code examples to guide learners through the principles of Markov decision processes and reinforcement learning theory. The repository distinguishes itself by integrating a grid-based simulation framework that allows users to test algorithms within custom environments. This environment supports the analysis of agent performance by rendering state values, polici
Maps continuous environment coordinates to numerical indices to simplify transition probability calculations.
MATLAB · artificial-intelligence · book · courses
marcotcr/lime
12,142 GitHub stars
This project is an agnostic model interpretability framework and explainability tool designed to provide local interpretable explanations for individual predictions. It functions as a local surrogate model that approximates the behavior of any machine learning classifier or regression model to identify the most influential features for a specific instance. The framework is designed to be model-agnostic, meaning it can explain predictions across tabular, text, and image data regardless of the underlying architecture. It employs local linear approximations and feature importance visualization t
Provides utilities to convert continuous numerical variables into discrete bins to simplify feature influence explanations.
JavaScript
facebookresearch/seamless_communication
11,797 GitHub stars
This project is a multimodal translation framework and large language model capable of speech-to-speech, speech-to-text, and text-to-text translation across nearly 100 languages. It provides a real-time speech translation engine and a comprehensive toolkit for converting spoken audio between languages. The system is distinguished by its ability to preserve the original speaker's tone, pace, and prosody during translation. It utilizes a specialized on-device inference toolkit that converts model checkpoints into C-based libraries, enabling low-latency execution on mobile and edge hardware with
Transforms continuous audio waveforms into sequences of discrete units for efficient model processing.
Jupyter Notebook
sjwhitworth/golearn
9,438 GitHub stars
GoLearn is a machine learning library for the Go programming language. It provides a supervised learning framework and a toolkit for building, training, and evaluating predictive models through a standardized interface. The project implements a data frame system that loads CSV files into structured grids for matrix operations. It includes a preprocessing library for discretizing continuous variables and a model evaluation toolkit that utilizes confusion matrices and cross-validation to measure precision and recall. The library covers data engineering and management, including the ability to
Processes discrete data by merging histograms and combining data points to prepare datasets for predictive modeling.
Go
iamseancheney/python_for_data_analysis_2nd_chinese_version
8,937 GitHub stars
This project is an educational resource and a collection of instructional materials for performing data manipulation and statistical analysis using Python. It provides a comprehensive set of guides and code examples for using the Pandas, NumPy, and Matplotlib libraries to analyze structured data. The resource includes a dedicated guide for reshaping, cleaning, and aggregating tabular data and time series via Pandas, alongside a reference for high-performance vectorized operations and linear algebra using NumPy. It also features tutorials for creating publication-quality charts, distribution p
Converts continuous numerical features into discrete bins or quartiles for distribution analysis.
matplotlib · numpy · pandas
saulpw/visidata
8,834 GitHub stars
VisiData is a terminal-based interactive data analysis tool and browser designed for exploring, filtering, and sorting large tabular datasets. It functions as a structured data inspector that loads and flattens complex formats like JSON, XML, and PCAP into interactive sheets, as well as a terminal file manager for navigating directories and performing staged filesystem operations. The project distinguishes itself by rendering data visualizations, such as scatter plots and histograms, directly in the terminal using Unicode Braille characters. It provides a Python-based data wrangling environme
Groups numeric values into calculated ranges to create histograms and visualize distribution.
Python · cli · csv · datajournalism
jackzhenguo/python-small-examples
8,132 GitHub stars
This project is a comprehensive library of practical Python code examples and patterns. It provides a collection of scripts and snippets designed to demonstrate a wide range of programming tasks, from basic syntax to advanced implementation patterns. The repository focuses on several core domains, including the implementation of concurrency and multithreading examples, data analysis snippets for cleaning and manipulating tabular data, and various data visualization examples. It also covers automation scripts for file system management and a variety of general programming patterns. Additional
Implements grouping of continuous numeric values into discrete categories based on range boundaries.
Python · data-science · machine-learning · python
lllyasviel/Omost
7,613 GitHub stars
Omost is a system of software components designed for iterative image refinement, regional layout control, and the optimization of text-to-image embedding processes. It functions as a diffusion model layout controller and an engine that uses large language models to generate executable code for precise control over image composition. The project features a conversational image editor that allows for the refinement of visual content through natural language instructions and automated code execution. It distinguishes itself through a text embedding optimizer that organizes sub-prompts into tree
Provides a grid-based coordinate system to map global and local descriptions to specific image areas.
Python
biolab/orange3
5,635 GitHub stars
Orange3 is a visual data mining platform that provides an interactive canvas for building data analysis workflows without writing code. At its core, it offers a widget-based visual programming environment where users connect configurable components to perform data preprocessing, machine learning model training, statistical evaluation, and interactive visualization. The platform is built on NumPy-backed data tables with domain descriptors that define variable names, types, and roles, and includes a lazy SQL query proxy for working with database tables without loading all data into memory. The
Ships a widget that converts continuous numeric attributes into categorical bins using partitioning strategies.
Python
udacity/deep-reinforcement-learning
5,169 GitHub stars
This project is a deep reinforcement learning curriculum that provides educational materials and implementation exercises for mastering neural-network-based agents. It serves as a framework for building reference versions of value-based and policy-based methods to solve sequential decision problems. The project provides specific implementations for continuous control simulations and multi-agent reinforcement learning, where agents are trained to cooperate or compete in shared environments. It includes a policy gradient framework for optimizing agent behavior through methods such as REINFORCE. Capabilities cover a wide range of optimization algorithms, including deep Q-learning, deterministic policy gradients, and dynamic programming for modeling Markov decision processes. The system supports various training domains, such as robotic navigation, automated financial trading, and physics-based simulations. The materials are delivered as a series of Jupyter Notebooks.
Provides methods for mapping continuous environment coordinates to discrete numerical indices for state-space modeling.
Jupyter Notebook · cross-entropy · ddpg · deep-reinforcement-learning
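
One common way to map continuous coordinates to discrete state indices is a uniform grid. The sketch below is not taken from the course notebooks; the helper names `make_grid`, `discretize`, and `flat_index`, and the bounds used in the example, are assumptions made for illustration.

```python
from bisect import bisect_right


def make_grid(lows, highs, bins_per_dim):
    """Return the interior split points for each dimension (bins - 1 points each)."""
    grid = []
    for low, high, bins in zip(lows, highs, bins_per_dim):
        step = (high - low) / bins
        grid.append([low + step * i for i in range(1, bins)])
    return grid


def discretize(state, grid):
    """Map a continuous state to a tuple of per-dimension bin indices."""
    # bisect_right counts the split points <= x, which is the bin index;
    # out-of-range values fall into the first or last bin.
    return tuple(bisect_right(splits, x) for x, splits in zip(state, grid))


def flat_index(cell, bins_per_dim):
    """Flatten a grid cell into a single row-major state index."""
    index = 0
    for c, bins in zip(cell, bins_per_dim):
        index = index * bins + c
    return index


bins_per_dim = (4, 5)
grid = make_grid((-1.0, -5.0), (1.0, 5.0), bins_per_dim)
cell = discretize((0.25, -4.0), grid)
print(cell, flat_index(cell, bins_per_dim))  # (2, 0) 10
```

The flat index can address a row in a tabular value function or transition table with 4 × 5 = 20 states.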
datawhalechina/joyful-pandas
5,164 GitHub stars
This project is a comprehensive pandas data analysis tutorial and an instructional guide designed for learning data manipulation and analysis. It serves as a guide to tabular data processing and a handbook for time series analysis, offering a structured approach to cleaning, merging, and transforming datasets. The repository functions as a feature engineering course for data, offering tutorials on constructing and selecting dataset features to improve machine learning model performance. It also includes a guide to vectorized data operations for element-wise mathematical computations and array manipulation. The material covers a wide range of capabilities, including data cleaning workflows, data integration tasks, and tabular data analysis. It offers guidance on processing textual information, handling categorical data, and optimizing execution speed for large datasets. The project is delivered as a series of Jupyter Notebooks containing hands-on exercises and targeted practice problems.
Teaches how to convert continuous numerical values into discrete bins for improved data interpretability.
Jupyter Notebook · pandas
vega/vega-lite
5,216 GitHub stars
Vega-Lite is a high-level declarative language for specifying interactive, multi-view visualizations. It compiles a concise JSON specification into a full Vega visualization, automatically inferring scales, axes, and legends from encoding declarations. The grammar-of-graphics encoding maps data fields to visual channels such as position, color, size, and shape, while a multi-view composition grammar enables layered, faceted, concatenated, and repeated layouts. Reactive parameter binding links named parameters to input widgets, selections, and expressions for dynamic updates. The project suppo
Vega-Lite discretizes numeric values into bins for aggregation and histogram visualizations.
TypeScript · charts · declarative-language · plot
accord-net/framework
4,540 GitHub stars
This project is a scientific computing framework for the .NET ecosystem, offering a comprehensive suite of libraries for numerical analysis, statistics, and mathematical optimization. It serves as a foundational toolkit for developing applications in machine learning, digital signal processing, and computer vision. The framework provides specialized toolkits for training and deploying predictive models, including neural networks, support vector machines (SVM), and decision trees. It is also distinguished by deep integrations for real-time visual analysis, such as object tracking and facial feature detection, alongside a dedicated digital signal processing library for capturing and filtering audio and sensor signals. The capability surface extends to high-level matrix decomposition and linear algebra, probabilistic state modeling, and heuristic search algorithms. It also covers a wide range of data manipulation utilities, from dimensionality reduction and normalization to spatial data organization and scientific visualization components. The system includes hardware integration controllers for camera configuration, GPIO port management, and specialized depth-sensing hardware.
Converts continuous numerical data into discrete bins or categories for improved model interpretability.
C#
quantopian/alphalens
4,143 GitHub stars
Alphalens is a quantitative alpha factor analysis library designed to measure the predictive power of financial factors. It serves as a computational toolset for processing financial time series and calculating performance metrics to evaluate quantitative trading hypotheses. The library distinguishes itself through the use of quantile-based data binning to analyze return distributions across different factor strength levels. It aligns historical alpha signals with forward-looking price changes to isolate predictive effects and transforms these metrics into heatmaps and time-series charts for
Converts continuous numerical financial signals into discrete bins or quartiles to analyze return distributions.
Jupyter Notebook · algorithmic-trading · finance · jupyter
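
Quantile-based binning, as mentioned in the description above, assigns roughly equal numbers of observations to each bin. The following is a minimal rank-based sketch, not Alphalens's implementation; `quantile_bins` and the sample factor values are hypothetical.

```python
def quantile_bins(values, n_quantiles):
    """Assign each value a 1-based quantile label from its rank in the sorted data."""
    n = len(values)
    # Indices of the values in ascending order of value.
    order = sorted(range(n), key=lambda i: values[i])
    labels = [0] * n
    for rank, i in enumerate(order):
        labels[i] = rank * n_quantiles // n + 1
    return labels


factor = [0.9, -1.2, 0.1, 2.5, -0.3, 1.4, 0.0, -2.0]
print(quantile_bins(factor, 4))  # [3, 1, 3, 4, 2, 4, 2, 1]
```

Because labels come from ranks, tied values may land in adjacent bins; a production tool would need an explicit tie-handling rule.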
rlabbe/filterpy
3,772 GitHub stars
filterpy is a toolkit for Bayesian state estimation, Gaussian statistical analysis, and time-series noise reduction. It provides a library of linear and non-linear Kalman filters, as well as routines for non-Gaussian state estimation and signal smoothing. The project implements a variety of estimation methods, including particle filtering using Markov Chain Monte Carlo and resampling, and discrete Bayes filtering. It also includes a suite of algorithms for refining historical state estimates through backward and fixed-lag smoothing. Additional capabilities cover multivariate Gaussian analysi
Provides utilities to discretize linear differential equations to model system behavior between measurements.
Python
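
To show what discretizing a linear differential equation means in the simplest case, the sketch below handles a scalar system dx/dt = a·x + b·u with the input u held constant over each step. It does not use filterpy's API; `discretize_scalar` is a hypothetical helper, and the matrix case would need a matrix exponential instead of `math.exp`.

```python
import math


def discretize_scalar(a, b, dt):
    """Exact discretization of dx/dt = a*x + b*u with u constant over each step.

    Returns (f, g) such that x[k+1] = f * x[k] + g * u[k].
    """
    f = math.exp(a * dt)
    # Integral of exp(a*s)*b over one step; reduces to b*dt when a == 0.
    g = b * dt if a == 0 else (f - 1.0) / a * b
    return f, g


f, g = discretize_scalar(a=-0.5, b=1.0, dt=0.1)
x = 1.0
for _ in range(10):  # ten steps of 0.1 s with zero input
    x = f * x + g * 0.0
print(abs(x - math.exp(-0.5)) < 1e-12)  # True
```

After ten steps the discrete model agrees with the continuous solution x(1) = e^(-0.5), which is the property a filter relies on when propagating state between measurements.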
xtensor-stack/xtensor
3,748 GitHub stars
xtensor is a C++ multidimensional array library for numerical computing that provides N-dimensional containers with an interface mirroring the NumPy API. It utilizes a lazy evaluation expression engine to defer numerical computations until assignment, which minimizes memory allocations and intermediate copies. The library features a foreign memory array adaptor that allows it to wrap external buffers, such as NumPy arrays, to perform numerical operations in-place without duplicating data. It further optimizes performance through lazy broadcasting and a system that manages the lifetime of temp
xtensor maps array values to the index of the corresponding histogram bin.
C++ · c-plus-plus-14 · multidimensional-arrays · numpy
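
Mapping values to histogram bin indices can be sketched with a binary search over sorted bin edges. This is a Python illustration, not xtensor's C++ interface; the `digitize` function below is written for this example and assumes NumPy's default convention, where index `i` means `edges[i-1] <= value < edges[i]`.

```python
from bisect import bisect_right


def digitize(values, edges):
    """Return, for each value, the index i with edges[i-1] <= value < edges[i].

    `edges` must be sorted in increasing order. Values below edges[0] map to 0
    and values at or above edges[-1] map to len(edges).
    """
    return [bisect_right(edges, v) for v in values]


print(digitize([0.2, 6.4, 3.0, 1.6], [0.0, 1.0, 2.5, 4.0, 10.0]))  # [1, 4, 3, 2]
```

Each lookup costs O(log n) in the number of edges, and the edges need not be evenly spaced.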