3 repos

Awesome GitHub Repositories: Model Quantization

Methods for reducing the precision of model weights to decrease memory usage and accelerate inference.
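
As a rough illustration of what reducing weight precision can look like, the sketch below maps floating-point weights to 8-bit integer codes with a single shared scale (symmetric per-tensor quantization). It is not taken from any of the repositories listed here; the function names `quantize_int8` and `dequantize` and the sample weights are hypothetical, and real libraries use more elaborate schemes.

```python
def quantize_int8(weights):
    """Map floats to integer codes in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any scale would do.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float weights from the integer codes."""
    return [c * scale for c in codes]


# Hypothetical example weights, chosen only for illustration.
weights = [0.5, -1.27, 0.02, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

# The worst-case rounding error is about half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, max_error)
```

Stored as int8, each weight takes 1 byte instead of the 4 bytes of a float32 value, at the cost of the small rounding error printed above.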

Explore 3 awesome GitHub repositories matching Artificial Intelligence & ML · Model Quantization.


tensorflow/tensorflow
193,864 · GitHub
TensorFlow is a comprehensive machine learning framework designed for the construction, training, and deployment of complex mathematical models. It utilizes a graph-based execution model that represents operations as directed acyclic graphs, enabling automatic differentiation and efficient parallel processing. The syst
C++ · deep-learning · deep-neural-networks · distributed
huggingface/transformers
156,730 · GitHub
Transformers is a comprehensive library for machine learning that provides a unified interface for training, fine-tuning, and deploying transformer-based models. It supports a wide range of tasks, including text classification, language modeling, question answering, and sequence-to-sequence translation, while offering
Python · audio · deep-learning · deepseek
vllm-project/vllm
70,745 · GitHub
vLLM is a high-throughput inference engine designed for the efficient serving and execution of large language models. It functions as a production-ready distributed model server, providing standard API protocols for online serving while also supporting offline batch processing. The system is built to maximize token gen
Python · amd · blackwell · cuda
