7 repositories
Utilities for transforming deep learning models into optimized formats for inference.
Distinguishing note: Focuses on model format optimization for deployment.
Explore 7 GitHub repositories matching Artificial Intelligence & ML · Model Conversion Pipelines.
InsightFace is a comprehensive deep learning framework designed for face recognition, biometric identity verification, and feature extraction. It provides a specialized engine for one-to-one verification and one-to-many identification tasks, utilizing convolutional neural networks to transform raw image pixels into high-dimensional vector embeddings. The project includes a complete toolkit for detecting, aligning, and processing facial data to ensure consistent identity discrimination. Beyond core recognition, the platform distinguishes itself through an extensive model management and optimiz
Transforms deep learning models into optimized formats for high-performance inference.
This project is a framework for running Stable Diffusion image generation models on Apple Silicon using Core ML hardware acceleration. It provides a local generative AI pipeline for producing images from text prompts using Swift and Python without relying on external cloud APIs. The system includes a model converter to transform deep learning checkpoints into Core ML formats and a model optimizer to quantize weights and activations. It features a ControlNet integration layer to guide image generation using external signals such as edge and depth maps. Capabilities cover text-to-image generat
Transforms deep learning checkpoints into specialized Core ML formats for optimized execution on Apple hardware.
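As a rough illustration of what quantizing weights involves, here is a minimal Python sketch. It is not taken from this project and does not use Core ML; the function names `quantize_int8` and `dequantize_int8` are hypothetical, and the symmetric per-tensor 8-bit scheme is an assumption chosen for simplicity.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization (assumed scheme, for illustration).

    Maps each float to an integer in [-127, 127] using one shared scale.
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Guard against an all-zero tensor, where any scale would do.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate float values from the integers and the scale."""
    return [q * scale for q in quantized]


# Hypothetical weights, used only to exercise the two functions.
weights = [0.4, -1.0, 0.25, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
```

Each restored value differs from the original by at most half the scale, which is the precision traded away for storing one byte per weight instead of four.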
Ktransformers is a comprehensive framework designed for the operation, fine-tuning, and serving of large language models. It functions as a heterogeneous inference engine and quantized execution runtime, enabling the deployment of massive models by distributing computational workloads across both CPU and GPU resources. This architecture allows users to bypass local memory constraints, making it possible to run and train models that exceed the capacity of a single device. The project distinguishes itself through specialized support for sparse architectures, particularly mixture-of-experts mode
Merges expert and non-expert model weights into unified formats compatible with high-performance serving engines.
This project is a collection of pre-trained machine learning models and conversion pipelines designed for running inference directly in the browser using TensorFlow.js. It provides a library of ready-to-use models for computer vision, audio classification, and natural language processing tasks. The suite includes specialized tools for transforming Python-based Keras models into JSON formats compatible with web environments. It enables the deployment of these models by fetching architectures and weight shards via HTTP for client-side execution. The project covers a broad range of capabilities
Transforms Python-based Keras models into JSON formats compatible with browser-based execution.
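The general idea of a JSON description plus binary weight shards can be sketched in plain Python. This is an illustrative layout only, not the actual TensorFlow.js file format or converter; `export_model`, `load_weights`, the manifest keys, and the shard file names are all hypothetical.

```python
import json
import struct


def export_model(layers, shard_size):
    """Pack weights into fixed-size binary shards plus a JSON manifest.

    layers: list of (name, list_of_floats) pairs.
    The manifest layout here is an assumption made for illustration.
    """
    blob = bytearray()
    entries = []
    for name, values in layers:
        # Record where this tensor starts in the concatenated byte stream.
        entries.append({"name": name, "offset": len(blob), "count": len(values)})
        blob += struct.pack(f"<{len(values)}f", *values)  # little-endian float32
    shards = [bytes(blob[i:i + shard_size]) for i in range(0, len(blob), shard_size)]
    manifest = json.dumps({
        "dtype": "float32",
        "weights": entries,
        "shards": [f"shard{i}.bin" for i in range(len(shards))],
    })
    return manifest, shards


def load_weights(manifest_json, shards):
    """Rebuild the named weight lists from the manifest and shard bytes."""
    manifest = json.loads(manifest_json)
    blob = b"".join(shards)
    return {
        e["name"]: list(struct.unpack_from(f"<{e['count']}f", blob, e["offset"]))
        for e in manifest["weights"]
    }


# Hypothetical two-tensor model: 16 bytes of weights split into 8-byte shards.
layers = [("dense/kernel", [1.0, 2.0, 3.0]), ("dense/bias", [0.5])]
manifest, shards = export_model(layers, shard_size=8)
loaded = load_weights(manifest, shards)
```

Splitting the weights into small shards lets a client fetch them as separate HTTP requests and cache each one independently, while the JSON manifest tells it how to stitch them back together.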
Fauxpilot is a self-hosted AI coding assistant and local inference server. It functions as a proxy and API gateway that redirects traffic from IDE plugins to a local large language model, allowing for AI-assisted programming without external cloud dependencies. The project provides a specialized API emulation layer that mimics coding assistant protocols and a standardized OpenAI-compatible interface. This enables supported code editors to use local models for completions and suggestions by overriding default proxy URLs. The system includes capabilities for downloading and deploying local mod
Transforms deep learning model files into optimized formats required by specific inference engines.
MNN is a high-performance inference engine and framework designed for on-device machine learning. It provides a comprehensive environment for executing, optimizing, and deploying neural network models directly on mobile and resource-constrained edge devices. The framework distinguishes itself through a robust model optimization toolkit that supports quantization, compression, and structural graph manipulation to minimize memory footprint and maximize execution speed. It features a modular architecture that abstracts hardware-specific backends, allowing models to run efficiently across diverse
Includes an offline conversion pipeline to translate external model formats into optimized binary representations for local execution.
This project is a comprehensive collection of educational examples and reference implementations for building vision and language models using PyTorch. It serves as a deep learning tutorial covering the end-to-end process of developing neural networks, from initial architecture definition to final production deployment. The repository provides detailed guides on implementing a wide range of domain-specific models, including convolutional neural networks for object detection and segmentation, as well as transformer and recurrent architectures for natural language processing. It emphasizes gene
Transforms common model files into optimized runtime representations for high-performance inference.