What are the best open-source alternatives to Neural Compressor?

30 open-source projects similar to intel/neural-compressor, ranked by shared features. Top picks: vllm-project/llm-compressor, nvidia/isaac-gr00t, pytorch/torchtune, paddlepaddle/paddle-lite, nebuly-ai/nebullvm, autogptq/autogptq, alibaba/mnn, tracel-ai/burn, dusty-nv/jetson-inference, paddlepaddle/paddlenlp.

Is vllm-project/llm-compressor a good alternative to Neural Compressor?

llm-compressor is a quantization toolkit and post-training library designed to reduce the memory footprint and size of large language models. It provides a framework for compressing models using weight and activation quantization to enable more efficient deployment. The project distinguishes itsel…

Is nvidia/isaac-gr00t a good alternative to Neural Compressor?

nvidia/isaac-gr00t is an open-source alternative to Neural Compressor.

Is pytorch/torchtune a good alternative to Neural Compressor?

Torchtune is a PyTorch-native library for fine-tuning, aligning, and quantizing large language models. It provides a configurable training pipeline orchestrated through YAML recipes, with CLI overrides and component swapping, distributed training via FSDP2, memory optimizations, and parameter-effic…

Is paddlepaddle/paddle-lite a good alternative to Neural Compressor?

Paddle-Lite is a deep learning inference engine and edge computing runtime designed to execute trained models on mobile and edge devices. It provides a hardware-accelerated inference framework and a decoupled runtime with a minimal binary footprint to operate in resource-constrained environments wi…

Is nebuly-ai/nebullvm a good alternative to Neural Compressor?

Nebullvm is an AI inference accelerator, GPU resource orchestrator, and performance optimization library for large language models. It functions as an optimization layer designed to lower operational costs by aligning model execution with underlying hardware architectures. The system maximizes clu…

Is autogptq/autogptq a good alternative to Neural Compressor?

AutoGPTQ is a model compression toolkit and post-training quantization framework designed to reduce the memory footprint of large language models. It utilizes the GPTQ algorithm to compress neural network weights, lowering hardware requirements and reducing VRAM usage. The project serves as an inf…

Is alibaba/mnn a good alternative to Neural Compressor?

MNN is a high-performance inference engine and framework designed for on-device machine learning. It provides a comprehensive environment for executing, optimizing, and deploying neural network models directly on mobile and resource-constrained edge devices. The framework distinguishes itself thro…

Is tracel-ai/burn a good alternative to Neural Compressor?

Burn is a deep learning framework designed for building, training, and deploying neural networks using a modular architecture. As a machine learning library built in Rust, it provides a backend-agnostic computational engine that enables the execution of models across diverse hardware, including cen…

Is dusty-nv/jetson-inference a good alternative to Neural Compressor?

jetson-inference is a set of libraries and tools for executing optimized deep learning models on embedded GPU hardware. Its primary purpose is to enable real-time computer vision and AI inference at the edge with low latency and high throughput. The project distinguishes itself through high-perfor…

Is paddlepaddle/paddlenlp a good alternative to Neural Compressor?

PaddleNLP is a development library and toolkit for training, fine-tuning, and deploying large and small language models using the PaddlePaddle framework. It provides a comprehensive suite for the entire natural language processing lifecycle, from model development to high-performance inference. Th…


Open-source alternatives to Neural Compressor

30 open-source projects similar to intel/neural-compressor, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best Neural Compressor alternative.

vllm-project/llm-compressor (2,764 stars)
llm-compressor is a quantization toolkit and post-training library designed to reduce the memory footprint and size of large language models. It provides a framework for compressing models using weight and activation quantization to enable more efficient deployment. The project distinguishes itself through a distributed quantization framework that utilizes data-parallel processing and disk-based weight offloading to handle massive model checkpoints that exceed available system memory. It includes specialized compressors for diverse architectures, including Mixture-of-Experts, Vision-Language,…
Language: Python. Topics: compression, quantization, sparsity.

NVIDIA/Isaac-GR00T (6,222 stars)
Language: Jupyter Notebook.

pytorch/torchtune (5,774 stars)
Torchtune is a PyTorch-native library for fine-tuning, aligning, and quantizing large language models. It provides a configurable training pipeline orchestrated through YAML recipes, with CLI overrides and component swapping, distributed training via FSDP2, memory optimizations, and parameter-efficient fine-tuning methods like LoRA, DoRA, and QLoRA. The library distinguishes itself through its YAML-driven configuration system that defines all training parameters and instantiates components from config files, with full CLI override capability for any field or component at launch time. It suppo…
Language: Python.

PaddlePaddle/Paddle-Lite (7,260 stars)
Paddle-Lite is a deep learning inference engine and edge computing runtime designed to execute trained models on mobile and edge devices. It provides a hardware-accelerated inference framework and a decoupled runtime with a minimal binary footprint to operate in resource-constrained environments without third-party dependencies. The project includes a model quantization tool for reducing precision and size via static and dynamic quantization, as well as a computation graph optimizer. These tools reduce latency and memory usage by fusing operators and pruning the model intermediate representat…
Language: C++. Topics: arm, baidu, deep-learning.

Open-source alternatives to Neural Compressor

vllm-project/llm-compressor

NVIDIA/Isaac-GR00T

pytorch/torchtune

PaddlePaddle/Paddle-Lite

nebuly-ai/nebullvm

AutoGPTQ/AutoGPTQ

alibaba/MNN

tracel-ai/burn

dusty-nv/jetson-inference

PaddlePaddle/PaddleNLP

pytorch/examples

openvinotoolkit/openvino

pytorch/executorch

facebookresearch/metaseq

Lightning-AI/litgpt

apple/ml-fastvlm

PanQiWei/AutoGPTQ

timdettmers/bitsandbytes

google-ai-edge/LiteRT

TingsongYu/PyTorch_Tutorial

pytorch/vision

intel/ipex-llm

bitsandbytes-foundation/bitsandbytes

Tencent/PocketFlow

divamgupta/diffusionbee-stable-diffusion-ui

ggml-org/whisper.cpp

NVIDIA/TensorRT

intel-analytics/ipex-llm

meta-llama/llama-models

OpenNMT/CTranslate2