Why is jingyaogong/minimind a recommended repository for Hyperparameter Configurations?

The framework allows for the configuration of model hyperparameters, such as embedding dimensions and layer counts, to balance training stability, inference speed, and performance.

Why is tensorflow/tensor2tensor a recommended repository for Hyperparameter Configurations?

Uses structured configuration files to define and manage model hyperparameters for automated parallel search.

Why is speechbrain/speechbrain a recommended repository for Hyperparameter Configurations?

Organizes training experiments and model settings using a structured configuration language to streamline development workflows.

Why is google/dopamine a recommended repository for Hyperparameter Configurations?

Provides tools for managing and tuning model hyperparameters via external files for reproducible research.

Why is TingsongYu/PyTorch_Tutorial a recommended repository for Hyperparameter Configurations?

Manages model hyperparameters through a command-line argument parser to facilitate tuning without code changes.

Why is AntixK/PyTorch-VAE a recommended repository for Hyperparameter Configurations?

Provides a system for managing model hyperparameters via external configuration files to ensure experimental reproducibility.

Why is EleutherAI/gpt-neox a recommended repository for Hyperparameter Configurations?

Enables the definition of model hyperparameters such as layer count, hidden size, and attention head configurations.

Why is poloclub/transformer-explainer a recommended repository for Hyperparameter Configurations?

Allows updating model configuration variables in memory during runtime to observe immediate changes in learning behavior.

Why is open-mmlab/mmdetection3d a recommended repository for Hyperparameter Configurations?

Enables adapting 3D detection pipelines to new datasets through configuration files.

Why is zihangdai/xlnet a recommended repository for Hyperparameter Configurations?

Provides configurations for layer counts, attention heads, and hidden sizes to ensure consistency across training phases.

17 repositories

Awesome GitHub Repositories · Hyperparameter Configurations

Tools for managing and tuning model parameters to optimize training and inference performance.

Distinguishing note: Focuses on the configuration of model parameters rather than the architecture definition itself.

Explore 17 awesome GitHub repositories matching Artificial Intelligence & ML · Hyperparameter Configurations.


jingyaogong/minimind
51,834 stars
This project is a comprehensive framework for the entire lifecycle of transformer-based language models, supporting everything from foundational pretraining to specialized deployment. It provides a modular toolkit for defining neural network architectures, managing data preparation pipelines, and executing training routines across various scales. The framework is designed to handle the full model development process, including supervised fine-tuning, behavioral alignment, and the integration of agentic capabilities. What distinguishes this framework is its focus on efficient training and adva
The framework allows for the configuration of model hyperparameters, such as embedding dimensions and layer counts, to balance training stability, inference speed, and performance.
Python · artificial-intelligence · large-language-model
tensorflow/tensor2tensor
17,009 stars
Tensor2Tensor is a deep learning library built on TensorFlow designed for training and evaluating complex machine learning models. It provides a unified framework for managing the entire model lifecycle, including data ingestion, training execution, and performance evaluation across diverse hardware environments. The library distinguishes itself through a modular architecture that supports multimodal data processing, allowing for the simultaneous analysis of text, audio, and image inputs. It features a central registry system that enables developers to extend the framework with custom models,
Uses structured configuration files to define and manage model hyperparameters for automated parallel search.
Python · deep-learning · machine-learning · machine-translation
speechbrain/speechbrain
11,624 stars
SpeechBrain is an all-in-one deep learning toolkit designed for speech and audio processing. Built as a modular library, it provides a structured environment for developing, training, and deploying neural network models across a wide range of tasks, including automatic speech recognition, speaker identification, and audio enhancement. The framework distinguishes itself through a configuration-driven approach that separates model architecture and training hyperparameters from application logic. By utilizing externalized configuration files and standardized recipes, it enables reproducible rese
Organizes training experiments and model settings using a structured configuration language to streamline development workflows.
Python · asr · audio · audio-processing
google/dopamine
10,879 stars
Dopamine is a reinforcement learning research framework designed for prototyping and testing algorithms across diverse simulated environments. It provides an agent development toolkit that utilizes a flat class hierarchy to facilitate the creation and extension of learning agents. The framework includes a standardization layer via environment wrappers that connect agents to various physics simulations and gaming environments. It also features a high-performance experience replay buffer for storing and sampling transition data to improve training stability, alongside a dedicated hyperparameter
Provides tools for managing and tuning model hyperparameters via external files for reproducible research.
Jupyter Notebook
TingsongYu/PyTorch_Tutorial
8,018 stars
This project is a comprehensive collection of educational examples and reference implementations for building vision and language models using PyTorch. It serves as a deep learning tutorial covering the end-to-end process of developing neural networks, from initial architecture definition to final production deployment. The repository provides detailed guides on implementing a wide range of domain-specific models, including convolutional neural networks for object detection and segmentation, as well as transformer and recurrent architectures for natural language processing. It emphasizes gene
Manages model hyperparameters through a command-line argument parser to facilitate tuning without code changes.
Python
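As a rough illustration of the command-line approach described for this repository, the sketch below exposes a few hyperparameters through Python's standard `argparse` module. It is not taken from the repository itself: the flag names (`--lr`, `--batch-size`, `--epochs`), their defaults, and the helper names `build_parser` and `parse_hparams` are all assumptions chosen for the example.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser whose flags are hypothetical hyperparameter names."""
    parser = argparse.ArgumentParser(description="Toy training script")
    parser.add_argument("--lr", type=float, default=1e-3, help="learning rate")
    parser.add_argument("--batch-size", type=int, default=32, help="samples per batch")
    parser.add_argument("--epochs", type=int, default=10, help="number of training epochs")
    return parser


def parse_hparams(argv=None) -> dict:
    """Parse flags into a plain dict; argv=None falls back to sys.argv."""
    return vars(build_parser().parse_args(argv))
```

Calling `parse_hparams(["--lr", "0.01", "--epochs", "3"])` changes two values and leaves the batch size at its default, so a new run can be tuned from the shell without editing the script.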
AntixK/PyTorch-VAE
7,650 stars
This project is a deep learning research toolkit and generative model library providing implementations of Variational Autoencoders using the PyTorch framework. It serves as a framework for training and evaluating autoencoder architectures to learn latent representations for data reconstruction and the generation of synthetic data samples. The toolkit focuses on unsupervised feature learning and generative model training, featuring a system for mapping external configuration files to model hyperparameters to ensure reproducible experimental runs. It includes mechanisms for tracking training p
Provides a system for managing model hyperparameters via external configuration files to ensure experimental reproducibility.
Python · architecture · beta-vae · celeba-dataset
EleutherAI/gpt-neox
7,392 stars
gpt-neox is a distributed training system and framework for building large-scale autoregressive language models. It implements the transformer architecture and provides a toolkit for training models with billions of parameters by distributing weights across compute clusters. The framework distinguishes itself through extensive support for distributed model parallelism, including pipeline and sequence parallelism, to overcome single-device memory limits. It further supports sparse model architectures using a mixture of experts system with Sinkhorn-based routing. The project covers a broad ran
Enables the definition of model hyperparameters such as layer count, hidden size, and attention head configurations.
Python · deepspeed-library · gpt-3 · language-model
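To make the three hyperparameters named above concrete, here is a minimal, generic sketch of a transformer configuration object. It does not reproduce gpt-neox's own configuration format; the class name `TransformerConfig`, its field names, and the default values are assumptions for illustration. The one constraint it encodes, that the hidden size must divide evenly among the attention heads, holds for standard multi-head attention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransformerConfig:
    """Hypothetical container for three common transformer hyperparameters."""

    num_layers: int = 12
    hidden_size: int = 768
    num_attention_heads: int = 12

    def __post_init__(self) -> None:
        # Each head receives an equal slice of the hidden dimension.
        if self.hidden_size % self.num_attention_heads != 0:
            raise ValueError("hidden_size must be divisible by num_attention_heads")

    @property
    def head_dim(self) -> int:
        """Width of a single attention head."""
        return self.hidden_size // self.num_attention_heads
```

Validating at construction time means an inconsistent combination fails before any training resources are allocated.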
poloclub/transformer-explainer
6,790 stars
This project is a collection of interactive graphical tools designed for monitoring neural network training, latent space mappings, and the internal mechanisms of transformers. It functions as a visual learning environment for understanding how large language models process tokens and an educational tool for analyzing the interactions between generators and discriminators within adversarial networks. The system provides a browser-based transformer architecture visualizer to show the mathematical operations used for token prediction in real time. It also includes a generative adversarial netwo
Allows updating model configuration variables in memory during runtime to observe immediate changes in learning behavior.
JavaScript · deep-learning · generative-ai · gpt
open-mmlab/mmdetection3d
6,273 stars
MMDetection3D is an open-source toolbox for 3D perception, providing a unified framework for detecting and segmenting objects in three-dimensional environments. It supports a range of core tasks including monocular 3D object detection from single camera images, LiDAR-based 3D object detection from raw point clouds, and multi-modal fusion that combines camera images with LiDAR data. The toolbox also covers point cloud semantic segmentation, assigning class labels to every point in a scan for scene understanding. The project distinguishes itself through a config-driven pipeline that orchestrate
Enables adapting 3D detection pipelines to new datasets through configuration files.
Python · 3d-object-detection · object-detection · point-cloud
zihangdai/xlnet
6,182 stars
This project is a natural language processing framework centered on a generalized autoregressive pretrainer designed for unsupervised language representation. It implements a language model that combines permutation-based training with a Transformer-XL backbone to function as a long-context text processor. The system is distinguished by its ability to handle text sequences that exceed standard length limits by using segment-level recurrence and relative positional encoding. It scales high-performance pretraining across multiple GPUs and TPU clusters using distributed training implementations. The codebase covers the entire machine learning workflow, including text cleaning and subword tokenization for data preprocessing, as well as task-specific fine-tuning for question answering, reading comprehension, and text classification. It includes utilities for parameter optimization, learning rate scheduling, and evaluating answer probabilities through precision-recall metrics. The project provides configurations for managing model hyperparameters and hardware-accelerated training across multiple hosts.
Provides configurations for layer counts, attention heads, and hidden sizes to ensure consistency across training phases.
Python
facebookresearch/mmf
5,635 stars
MMF is a modular framework for building, training, and evaluating vision-and-language models. It provides a configuration-driven experiment system where model, dataset, and training parameters are defined through composable YAML files, alongside a curated model zoo of pretrained checkpoints for state-of-the-art multimodal architectures. The framework includes a multimodal dataset loader that downloads, processes, and batches vision-and-language data, and a vision-language model trainer supporting distributed training, mixed precision, and checkpoint-based resumption. The framework distinguish
Defines encoder types, dimensions, and classifier settings through structured configuration files.
Python · captioning · deep-learning · dialog
ashleve/lightning-hydra-template
5,303 stars
This project is a standardized boilerplate for machine learning experiments and a project template that combines PyTorch Lightning with the Hydra configuration framework. It provides a structured codebase for organizing deep learning workflows, designed specifically to integrate hierarchical configuration management with distributed training. The template features a specialized workflow for hyperparameter optimization and batch experiment execution, enabling automated parameter sweeps without modifying source code. It uses a hierarchical system for managing settings through YAML files and command-line overrides to ensure reproducible results across experiment runs. The project covers broad capability areas, including distributed deep learning training on multiple hardware accelerators, data pipeline encapsulation, and multi-backend experiment logging. It also integrates code quality automation through pre-commit hooks, linters, and formatters, alongside tools for model checkpoint management and evaluation.
Tracks sets of hyperparameters through dedicated configurations to maintain a history of optimal settings.
Python · best-practices · config · deep-learning
victoresque/pytorch-template
5,116 stars
This project is a PyTorch boilerplate and training framework designed to standardize the development of deep learning experiments. It provides a structured directory layout and a set of base classes for starting new projects, ensuring a consistent workflow from data pipeline construction to model execution. The framework is distinguished by a centralized configuration manager for hyperparameters that supports command-line overrides, and by a hardware acceleration layer for distributing computational workloads across multiple GPUs. It also implements a class-based orchestration layer to automate dataset shuffling, batch generation, and validation splitting. The system covers a wide range of training capabilities, including automated metric logging, checkpoint-based state serialization for resuming training, and deterministic results through seed synchronization. It also includes tools for monitoring training progress and implementing early stopping based on performance benchmarks.
Provides tools for managing and tuning model hyperparameters via configuration files and command-line flags.
Python
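The general idea of a configuration file combined with command-line overrides can be sketched in a few lines of standard-library Python. This is an illustration of the pattern only, not this repository's implementation: the function name `apply_overrides`, the dotted `section.key=value` syntax, and the sample keys are all assumptions.

```python
import json


def apply_overrides(config: dict, overrides: list[str]) -> dict:
    """Return a copy of config with 'section.key=value' overrides applied."""
    result = json.loads(json.dumps(config))  # deep copy of JSON-compatible data
    for item in overrides:
        path, _, raw = item.partition("=")
        keys = path.split(".")
        node = result
        for key in keys[:-1]:
            node = node.setdefault(key, {})  # walk or create nested sections
        try:
            node[keys[-1]] = json.loads(raw)  # numbers, booleans, lists
        except json.JSONDecodeError:
            node[keys[-1]] = raw  # otherwise keep the raw string
    return result


# Hypothetical base configuration, as it might be loaded from a JSON file.
base = json.loads('{"optimizer": {"type": "Adam", "lr": 0.001}, "epochs": 100}')
tuned = apply_overrides(base, ["optimizer.lr=0.0001", "optimizer.type=SGD"])
```

Because `apply_overrides` returns a copy, the base configuration stays untouched and each run's effective settings can be saved alongside its results.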
datawhalechina/tiny-universe
4,505 stars
Tiny Universe is an educational monorepo that delivers multiple independent implementations of core AI subsystems as self-contained Jupyter notebooks. It provides from-scratch constructions of foundational architectures including a complete Transformer model built from the original paper specification, a denoising diffusion probabilistic model for image generation, and a ReAct-style autonomous agent framework that equips an LLM with tools for planning and multi-step task execution. The project distinguishes itself by covering the full lifecycle of modern AI systems through hands-on implementa
Configures model hyperparameters like vocabulary size, hidden dimensions, and layer counts.
Jupyter Notebook · agent · diffusion · evaluation-metrics
goodfeli/adversarial
4,074 stars
This project is a generative adversarial network implementation and research framework. It provides the tools and hyperparameters needed to train and evaluate generative models on various datasets, and is designed specifically to reproduce results from academic research. The framework includes a Parzen density likelihood estimator to compute the model's log-likelihood. This enables quantitative evaluation of generative distributions and measurement of overall model performance. The codebase covers machine learning research capabilities, focusing on training adversarial networks and evaluating synthetic data distributions.
Provides tools for managing and tuning hyperparameters via external configuration files for research reproducibility.
Python
DLR-RM/rl-baselines3-zoo
2,725 stars
This project is a collection of pretrained reinforcement learning agents and training scripts built on Stable Baselines3 and Gymnasium. It provides a framework for training agents to solve specific tasks, managing experiment reproducibility, and deploying pretrained models. The system includes a specialized benchmarking suite and optimization tools for tuning agent settings. It utilizes automated search spaces and distributed trials to maximize performance, while employing bootstrap sampling to generate statistically robust performance metrics and confidence intervals. Broad capabilities cov
Enables defining learning rates and custom policies via external configuration files to tune agent behavior.
Python · deep-reinforcement-learning · gym · hyperparameter-optimization
WillKoehrsen/machine-learning-project-walkthrough
1,281 stars
This project is an educational resource and step-by-step guide for implementing end-to-end machine learning workflows. It offers a structured walkthrough for managing the entire lifecycle of a predictive modeling project, from initial data cleaning and feature engineering to final model training and performance evaluation. The repository uses interactive documents to interleave code, data visualizations, and narrative explanations, facilitating a reproducible approach to data science. By following this guided sequence, users can build and orchestrate pipelines that transform raw data into validated predictive models using real-world datasets. The workflow covers the core stages of machine learning development, including pipeline-oriented data processing and modular model evaluation. It emphasizes repeatable processes, enabling iterative experimentation through stateful data persistence and declarative configuration of model parameters.
Provides declarative configuration of model parameters to optimize predictive outcomes.
Jupyter Notebook