20 repositories
Tools and formats for saving and loading machine learning model states, optimizer parameters, and training metadata.
Distinguishing note: Focuses on the serialization format and consistency across hardware, distinct from general storage utilities.
Explore 20 GitHub repositories matching Artificial Intelligence & ML · Model Serialization.
DeepSpeed is a high-performance library designed to scale deep learning model training and inference across massive clusters of GPUs and compute nodes. It provides a comprehensive suite of tools for distributed training, enabling the execution of models that exceed the memory capacity of single devices through advanced parameter partitioning, pipeline-based model parallelism, and memory-efficient state offloading. The framework distinguishes itself through specialized communication-efficient optimizers and hardware-aware acceleration techniques. By utilizing gradient compression, quantization
Model, optimizer, and scheduler states are normalized into a consistent format to facilitate seamless saving and loading across heterogeneous hardware topologies.
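One way to picture a topology-independent checkpoint is sketched below. It is not DeepSpeed's API or on-disk format: `consolidate` and `reshard` are hypothetical helpers, and plain Python lists stand in for tensors. The idea is that per-rank shards are merged into one full state, which can then be split again for a different number of devices.

```python
from typing import Dict, List

State = Dict[str, List[float]]


def consolidate(shards: List[State]) -> State:
    """Merge rank-ordered shards, each holding a contiguous slice of every parameter."""
    full: State = {}
    for shard in shards:
        for name, values in shard.items():
            full.setdefault(name, []).extend(values)
    return full


def reshard(full: State, world_size: int) -> List[State]:
    """Split a full state into `world_size` contiguous slices per parameter."""
    shards: List[State] = [{} for _ in range(world_size)]
    for name, values in full.items():
        chunk = -(-len(values) // world_size)  # ceiling division
        for rank in range(world_size):
            shards[rank][name] = values[rank * chunk:(rank + 1) * chunk]
    return shards


# Hypothetical example: a state saved on 4 ranks is resumed on 2 ranks.
full_state = {
    "layer.weight": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8],
    "layer.bias": [1.0, 2.0, 3.0, 4.0],
}
saved = reshard(full_state, world_size=4)
restored = reshard(consolidate(saved), world_size=2)
```

Because the merged state does not depend on how many ranks wrote it, the same checkpoint can be loaded onto a different device count.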
Dive into LLMs is a framework designed for fine-tuning large language models and constructing modular machine learning pipelines. It provides a structured environment for adjusting pre-trained models on custom datasets while optimizing computational efficiency and training time. The project distinguishes itself by offering an interactive web interface that allows for the deployment and publication of trained models directly to a browser. This enables users to test and interact with model results through a standardized web-based environment. The platform supports the creation of flexible work
Converts trained model weights into portable file formats for execution across different environments.
spaCy is a Python natural language processing framework designed for industrial-scale text processing. It converts raw text into structured data for machine learning pipelines through a combination of statistical language model trainers, transformer-based text processors, and syntactic dependency parsers. The project enables the integration of pretrained transformer architectures to perform complex linguistic analysis and multi-task learning. It also provides a specialized system for neural named entity recognition to identify and categorize key entities within text. The framework covers a b
Bundles trained weights and configuration files into binary archives for consistent deployment.
XGBoost is a distributed machine learning library for implementing scalable gradient boosting decision trees used for regression, classification, and ranking. It functions as a predictive model framework and a cross-language toolkit, providing a core implementation with native bindings for Python, R, Java, Scala, and C++. The system is designed as a GPU-accelerated library that utilizes CUDA and NCCL to speed up the training of decision tree ensembles. It operates as a distributed framework capable of scaling training and prediction across multi-node clusters and GPU environments to process m
Provides binary model serialization that ensures consistent reproduction across different hardware and operating systems.
Saves trained Keras models as artifacts and loads them back for inference using dedicated functions that handle model serialization and retrieval.
minGPT is a minimal implementation of the Transformer architecture designed for training and experimenting with language models. It functions as a neural network training framework and a text generation engine, providing the necessary tools to manage data loading, backpropagation, and parameter updates for custom deep learning models. The project is structured as an educational resource for understanding how transformer architectures function by building and training models from scratch. It utilizes a modular block architecture and transformer-based self-attention to process sequences, allowi
Persists model parameters and configurations using state-dict serialization for deployment and loading.
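The state-dict pattern can be illustrated without any deep learning framework. The sketch below is a plain-Python stand-in, not minGPT's or PyTorch's code: `TinyModel`, `save_checkpoint`, and `load_checkpoint` are hypothetical names, and JSON is used only to keep the example self-contained. Parameters are exported as a flat name-to-values mapping, stored next to the configuration, and loaded back into a freshly constructed model.

```python
import json
import os
import tempfile


class TinyModel:
    """Hypothetical stand-in for a network: parameters live in a flat mapping."""

    def __init__(self, n_embd: int = 4, n_layer: int = 2):
        self.config = {"n_embd": n_embd, "n_layer": n_layer}
        self.params = {f"block{i}.weight": [0.0] * n_embd for i in range(n_layer)}

    def state_dict(self) -> dict:
        # Return copies so callers cannot mutate the live parameters.
        return {name: list(values) for name, values in self.params.items()}

    def load_state_dict(self, state: dict) -> None:
        if state.keys() != self.params.keys():
            raise KeyError("checkpoint keys do not match model parameters")
        self.params = {name: list(values) for name, values in state.items()}


def save_checkpoint(model: TinyModel, path: str) -> None:
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"config": model.config, "state": model.state_dict()}, f)


def load_checkpoint(path: str) -> TinyModel:
    with open(path, encoding="utf-8") as f:
        payload = json.load(f)
    model = TinyModel(**payload["config"])  # rebuild the architecture first
    model.load_state_dict(payload["state"])  # then restore the parameters
    return model


trained = TinyModel()
trained.params["block0.weight"] = [0.5, -0.25, 1.0, 2.0]
ckpt_path = os.path.join(tempfile.mkdtemp(), "model.json")
save_checkpoint(trained, ckpt_path)
reloaded = load_checkpoint(ckpt_path)
```

Storing the configuration with the parameters lets the loader rebuild the architecture before restoring weights, so the checkpoint does not depend on pickled class definitions.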
This project serves as a comprehensive, community-driven directory of high-quality open-source Python libraries and tools for machine learning, data science, and artificial intelligence. It functions as a centralized resource for developers to discover, evaluate, and track the maintenance status of software packages across the entire machine learning ecosystem. The platform distinguishes itself through automated popularity tracking and data-driven content curation, which programmatically validate and rank projects based on community activity and development velocity. By organizing these tools
Catalogs utilities for model serialization, optimization, and production deployment.
This project is a deep learning framework designed for constructing, training, and deploying neural networks across diverse hardware environments. It functions as a high-performance tensor computation library that provides both imperative and symbolic programming interfaces, allowing developers to balance flexible, step-by-step model building with the efficiency of compiled computation graphs. The framework distinguishes itself through a hybrid execution engine that integrates declarative graph compilation with imperative runtime logic. It supports scalable, distributed training across multip
Saves and loads neural network structures and weights by serializing the underlying computation graph.
Gensim is an unsupervised natural language processing toolkit designed for topic modeling, word embedding training, and the processing of large-scale text corpora. It provides a framework for discovering latent themes and semantic structures in text without the need for labeled data. The toolkit is distinguished by its ability to handle datasets that exceed system memory through iterator-based data streaming from disk. It also supports distributed model training, allowing complex modeling tasks to be executed across computer clusters. The library covers a broad range of analysis capabilities
Provides tools for saving and loading model states to maintain session continuity.
ggml is a low-level C++ tensor library and machine learning inference engine designed for performing mathematical operations on multi-dimensional arrays across diverse hardware platforms. It provides a foundational toolset for executing machine learning models and calculating mathematical gradients through an automatic differentiation library. The project features a quantized tensor framework that converts floating-point weights into integer representations to reduce memory usage and increase inference speed. It utilizes a custom binary format for model serialization to ensure rapid loading a
Implements binary formats for saving and loading model states and metadata to ensure consistency across hardware.
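A custom binary model format of this kind can be sketched with Python's `struct` module. The layout below is invented for illustration and is not ggml's actual file format: the `TNSR` magic bytes, the version field, and the `write_model` and `read_model` helpers are all assumptions. Fixing the byte order to little-endian and the float width to 32 bits is what keeps the file identical regardless of the machine that wrote it.

```python
import io
import struct

MAGIC = b"TNSR"  # hypothetical magic bytes, not ggml's actual header
VERSION = 1


def _write_str(stream, text):
    data = text.encode("utf-8")
    stream.write(struct.pack("<I", len(data)))  # length-prefixed UTF-8
    stream.write(data)


def _read_str(stream):
    (length,) = struct.unpack("<I", stream.read(4))
    return stream.read(length).decode("utf-8")


def write_model(stream, metadata, tensors):
    stream.write(MAGIC)
    stream.write(struct.pack("<III", VERSION, len(metadata), len(tensors)))
    for key, value in metadata.items():
        _write_str(stream, key)
        _write_str(stream, value)
    for name, values in tensors.items():
        _write_str(stream, name)
        stream.write(struct.pack("<I", len(values)))
        stream.write(struct.pack(f"<{len(values)}f", *values))  # float32, little-endian


def read_model(stream):
    if stream.read(4) != MAGIC:
        raise ValueError("not a TNSR file")
    version, n_meta, n_tensors = struct.unpack("<III", stream.read(12))
    if version != VERSION:
        raise ValueError(f"unsupported version {version}")
    metadata = {}
    for _ in range(n_meta):
        key = _read_str(stream)
        metadata[key] = _read_str(stream)
    tensors = {}
    for _ in range(n_tensors):
        name = _read_str(stream)
        (count,) = struct.unpack("<I", stream.read(4))
        tensors[name] = list(struct.unpack(f"<{count}f", stream.read(4 * count)))
    return metadata, tensors


buffer = io.BytesIO()
write_model(buffer, {"arch": "demo"}, {"w": [0.5, -1.25, 2.0], "b": [0.0]})
buffer.seek(0)
metadata, tensors = read_model(buffer)
```

The example values are exactly representable as 32-bit floats, so they round-trip unchanged; arbitrary Python floats would be rounded to float32 on write.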
OpenVINO is an AI inference engine and model serving platform designed to execute optimized deep learning models across CPUs, GPUs, and NPUs through a unified API. It includes a model optimization toolkit for converting, quantizing, and compressing models from various frameworks, alongside a specialized generative AI runtime for large language models. The project distinguishes itself through a plugin-based hardware acceleration layer that maps neural network operations to vendor-specific drivers. It features advanced execution mechanisms such as continuous batching, speculative decoding, and
Saves converted models to files to reduce load latency and shrink storage size.
This project is a comprehensive collection of educational examples and reference implementations for building vision and language models using PyTorch. It serves as a deep learning tutorial covering the end-to-end process of developing neural networks, from initial architecture definition to final production deployment. The repository provides detailed guides on implementing a wide range of domain-specific models, including convolutional neural networks for object detection and segmentation, as well as transformer and recurrent architectures for natural language processing. It emphasizes gene
Transforms model files into serialized engine or plan files to optimize the deployment process.
h2o-3 is a distributed machine learning platform and automated machine learning framework designed for training and deploying predictive models using distributed in-memory computing. It functions as a deep learning framework and a distributed model scoring engine, capable of operating as a Kubernetes ML cluster to process large datasets in parallel. The platform distinguishes itself through automated machine learning capabilities that automatically select the best algorithms and hyperparameters to optimize model performance. It provides specialized deep learning toolkits for tasks including i
Exports trained models as binary artifacts for high-performance scoring without requiring the full runtime environment.
BERTopic is a topic modeling library used to extract interpretable themes from collections of text documents and images. It functions as a document clustering framework that transforms unstructured data into numerical vectors to group semantically similar content. The project distinguishes itself through a multimodal embedding tool that allows for joint clustering of text and images in a shared vector space. It also features a class-based TF-IDF representation engine to identify representative words for clusters and an integrated system for using large language models to generate natural lang
Uses optimized serialization formats to reduce model file size and increase loading speed.
Flax is a deep learning framework and JAX neural network library designed for building complex machine learning models. It functions as a distributed training library and model state manager, providing a toolkit for defining flexible neural network architectures and scaling their training across multiple hardware devices. The project is characterized by a design that separates network logic from parameter values to remain compatible with pure functions. It uses hierarchical module composition to organize networks as trees of nested modules and employs a reference-based state management system
Provides tools and formats for saving and loading machine learning model states and optimizer parameters.
Smile is a comprehensive JVM machine learning library and statistical computing toolkit. It provides a suite of algorithms for classification, regression, and clustering, implemented natively for Java, Scala, and Kotlin. The project also functions as a deep learning framework, a natural language processing library, and an inference engine for large language models. The library distinguishes itself through GPU acceleration via LibTorch bindings and support for the ONNX model interchange format. It includes specialized capabilities for large language model inference, featuring Byte-Pair Encodin
Saves trained machine learning models to disk for deployment and pipeline integration.
Chainer is an open-source deep learning framework built around define-by-run automatic differentiation, where computation graphs are constructed dynamically during forward execution. This imperative approach allows networks to be built using standard Python control flow, with gradients computed automatically through reverse-mode differentiation on the dynamically recorded graph. The framework supports GPU acceleration through a NumPy-compatible array backend with CUDA and cuDNN support, and provides a pluggable device abstraction that lets users switch between CPU and GPU computation without c
Saves and loads model parameters and optimizer states for portable storage, inference, and continued training.
cuml is a GPU-accelerated machine learning library and framework that uses CUDA to speed up tabular data preprocessing and model execution. It provides a suite of tools for training and deploying classification, regression, and clustering models on NVIDIA GPUs and GPU clusters. The library is designed for scalability, offering a distributed GPU machine learning environment that can spread computation and data across multiple hardware accelerators and nodes to handle datasets that exceed the memory of a single device. It mirrors standard estimator interfaces so that CPU-based models can be replaced with GPU-accelerated versions within existing workflows. The project covers a wide range of machine learning capabilities, including supervised learning, unsupervised clustering, nearest-neighbor search, and dimensionality reduction for high-dimensional data. It also includes hardware-accelerated tabular data preprocessing for feature scaling and encoding, text feature extraction, time series analysis, and explainability of model predictions. Supporting utilities include tools for generating synthetic datasets, serializing model state, and computing model performance metrics.
Provides tools and formats for saving and loading machine learning model states and training metadata for persistence.
This project is a scientific computing framework for the .NET ecosystem, offering a comprehensive suite of libraries for numerical analysis, statistics, and mathematical optimization. It serves as a foundational toolkit for developing applications in machine learning, digital signal processing, and computer vision. The framework offers specialized toolkits for training and deploying predictive models, including neural networks, support vector machines (SVMs), and decision trees. It is also distinguished by deep integrations for real-time visual analysis, such as object tracking and facial feature detection, alongside a dedicated digital signal processing library for capturing and filtering audio and sensor signals. The capability surface extends to high-level matrix decomposition and linear algebra, probabilistic state modeling, and heuristic search algorithms. It also covers a wide range of data manipulation utilities, from dimensionality reduction and normalization to spatial data organization and scientific visualization components. The system includes hardware integration controllers for camera configuration, GPIO port management, and specialized depth-sensing hardware.
Saves and loads machine learning model states and weights to disk using configurable compression.
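Configurable compression on save can be illustrated with a short sketch. It is written in Python with the standard `gzip` module rather than in .NET, and it does not reflect the project's own API: `save_state` and `load_state` are hypothetical helpers. The caller picks a compression level that trades write speed against file size.

```python
import gzip
import json
import os
import tempfile


def save_state(state: dict, path: str, compresslevel: int = 6) -> int:
    """Write `state` as gzip-compressed JSON and return the file size in bytes."""
    raw = json.dumps(state).encode("utf-8")
    with gzip.open(path, "wb", compresslevel=compresslevel) as f:
        f.write(raw)
    return os.path.getsize(path)


def load_state(path: str) -> dict:
    # The reader does not need to know which level was used when writing.
    with gzip.open(path, "rb") as f:
        return json.loads(f.read().decode("utf-8"))


# Hypothetical model state with highly repetitive weights.
state = {"weights": [0.0] * 5000, "bias": [1.0] * 100}
out_dir = tempfile.mkdtemp()
stored_size = save_state(state, os.path.join(out_dir, "stored.json.gz"), compresslevel=0)
packed_size = save_state(state, os.path.join(out_dir, "packed.json.gz"), compresslevel=9)
loaded = load_state(os.path.join(out_dir, "packed.json.gz"))
```

Level 0 stores the data without compressing it, while level 9 spends more CPU time for a smaller file; both load through the same reader.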
Minigo is a TensorFlow-based reinforcement learning engine designed to master the game of Go. It functions as a comprehensive system for training neural networks to predict board policies and game outcomes, utilizing a model trainer to generate self-play data and optimize weights. The project is distinguished by its ability to perform large-scale game simulations using Kubernetes to distribute worker nodes across CPU, GPU, and TPU hardware. It employs a Monte Carlo Tree Search implementation to identify optimal moves and supports specialized hardware acceleration, including inference on Edge
The Go AI writes a serialized model and associated metadata to local or cloud storage.
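Writing a model together with its metadata can be sketched as a binary file plus a JSON sidecar. This is an illustration under stated assumptions, not Minigo's export code: `export_model`, `load_model`, the `.meta.json` suffix, and the metadata fields are hypothetical, and only local storage is shown. Recording a checksum in the sidecar lets the loader detect a model file that does not match its metadata.

```python
import hashlib
import json
import os
import tempfile


def export_model(model_bytes: bytes, out_dir: str, name: str, metadata: dict) -> str:
    """Write the serialized model and a JSON sidecar describing it."""
    os.makedirs(out_dir, exist_ok=True)
    model_path = os.path.join(out_dir, name + ".bin")
    with open(model_path, "wb") as f:
        f.write(model_bytes)
    sidecar = dict(
        metadata,
        sha256=hashlib.sha256(model_bytes).hexdigest(),
        size=len(model_bytes),
    )
    with open(model_path + ".meta.json", "w", encoding="utf-8") as f:
        json.dump(sidecar, f, indent=2, sort_keys=True)
    return model_path


def load_model(model_path: str):
    """Read the model and its sidecar, rejecting a mismatched pair."""
    with open(model_path, "rb") as f:
        data = f.read()
    with open(model_path + ".meta.json", encoding="utf-8") as f:
        meta = json.load(f)
    if hashlib.sha256(data).hexdigest() != meta["sha256"]:
        raise ValueError("model file does not match its metadata")
    return data, meta


# Hypothetical payload and metadata fields.
export_dir = tempfile.mkdtemp()
exported_path = export_model(b"\x00\x01\x02\x03", export_dir, "model-000001", {"board_size": 19})
model_data, model_meta = load_model(exported_path)
```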