awesome-repositories.com
© 2026 Bringes Technology SRL · VAT RO45896025 · hello@bringes.io
Quantization · Awesome GitHub Repositories

4 repos


Explore 4 awesome GitHub repositories tagged Artificial Intelligence & ML · Quantization.


Awesome Quantization GitHub Repositories

  • tensorflow/tensorflow

    193,864 · GitHub

    TensorFlow is a comprehensive machine learning framework designed for the construction, training, and deployment of complex mathematical models. It utilizes a graph-based execution model that represents operations as directed acyclic graphs, enabling automatic differentiation and efficient parallel processing. The syst

    C++ · deep-learning · deep-neural-networks · distributed
  • huggingface/transformers

    156,730 · GitHub

    Transformers is a comprehensive library for machine learning that provides a unified interface for training, fine-tuning, and deploying transformer-based models. It supports a wide range of tasks, including text classification, language modeling, question answering, and sequence-to-sequence translation, while offering

    Python · audio · deep-learning · deepseek
  • mlabonne/llm-course

    75,340 · GitHub

    This project is a comprehensive educational curriculum and engineering handbook focused on the lifecycle of large language models. It serves as a structured knowledge base for machine learning practitioners, covering the fundamental mathematical and architectural principles of transformer-based sequence modeling, as we

    course · large-language-models · llm
  • vllm-project/vllm

    70,745 · GitHub

    vLLM is a high-throughput inference engine designed for the efficient serving and execution of large language models. It functions as a production-ready distributed model server, providing standard API protocols for online serving while also supporting offline batch processing. The system is built to maximize token gen

    Python · amd · blackwell · cuda

Explore sub-tags

  • Model Quantization: Techniques and tools for reducing the memory footprint and computational requirements of neural networks to improve inference performance.
  • Model Quantization Frameworks: Frameworks that reduce model size and computational requirements by converting high-precision weights into lower-precision formats.
  • Quantization Methods: Techniques for reducing the precision of model weights to decrease memory usage and accelerate inference.
  • Quantization Plugin Interfaces: Extensible interfaces that allow developers to register custom quantization methods.
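
As a rough illustration of what "converting high-precision weights into lower-precision formats" can mean, here is a minimal sketch of symmetric per-tensor int8 quantization in plain Python. It is not taken from any of the repositories listed above; the function names quantize_int8 and dequantize_int8 and the sample weights are hypothetical, and real frameworks typically use more elaborate schemes (per-channel scales, calibration data, 4-bit formats).

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization (illustrative sketch only).

    Maps each float to an integer in [-127, 127] using a single scale
    derived from the largest absolute weight.
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Avoid division by zero when all weights are zero (or the list is empty).
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate float weights from the integers and the scale."""
    return [q * scale for q in quantized]


# Hypothetical sample weights, chosen only for demonstration.
weights = [0.5, -1.27, 0.0, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Each weight is stored as one small integer plus a shared scale, which is where the memory saving comes from. The cost is a rounding error of at most about half the scale per weight in this scheme.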