awesome-repositories.com
© 2026 Bringes Technology SRL · VAT RO45896025 · hello@bringes.io
Inference Optimization and Tuning · Awesome GitHub Repositories

6 repos

Explore 6 awesome GitHub repositories matching Artificial Intelligence & ML · Inference Optimization and Tuning.

Awesome Inference Optimization and Tuning GitHub Repositories

  • deepseek-ai/DeepSeek-V3

    101,631 · GitHub

    DeepSeek-V3 is a large language model that provides comprehensive resources for model utilization, including technical specifications, pre-trained weights, and evaluation benchmarks. The project details the core transformer architecture, including parameter counts and multi-token prediction modules, while supporting na

    Optimized execution paths leverage specialized hardware accelerators to support efficient half-precision inference.

    Python
  • browser-use/browser-use

    78,576 · GitHub

    Browser-use is a framework for building autonomous agents that navigate, interact with, and extract data from web interfaces using natural language instructions. By acting as an orchestration layer between large language models and browser automation protocols, it enables the execution of complex, multi-step workflows

    Adjusts operational behavior and inference parameters for Llama models to optimize their performance in web-based reasoning tasks.

    Python · ai-agents · ai-tools · browser-automation
  • hoppscotch/hoppscotch

    77,888 · GitHub

    Hoppscotch is an open-source API development ecosystem designed for building, testing, and debugging REST, GraphQL, and real-time APIs. It provides a unified platform that functions across web browsers, desktop applications, and command-line interfaces, allowing developers to manage the entire API lifecycle from a sing

    Configures AI-driven assistance to generate payloads and automate test script creation.

    TypeScript · api · api-client · api-rest
  • nomic-ai/gpt4all

    77,146 · GitHub

    GPT4All is a cross-platform runtime environment designed to execute large language models directly on local consumer hardware. By leveraging an optimized C++ inference backend, it enables private, offline AI interactions without requiring an internet connection or external cloud services. The project provides a compreh

    Provides granular controls for adjusting inference parameters, hardware acceleration settings, and model-specific execution behaviors.

    C++ · ai-chat · llm-inference
  • mlabonne/llm-course

    75,340 · GitHub

    This project is a comprehensive educational curriculum and engineering handbook focused on the lifecycle of large language models. It serves as a structured knowledge base for machine learning practitioners, covering the fundamental mathematical and architectural principles of transformer-based sequence modeling, as we

    Implements efficient attention mechanisms and optimization strategies to maximize inference throughput.

    course · large-language-models · llm
  • dair-ai/Prompt-Engineering-Guide

    70,526 · GitHub

    This project is a comprehensive educational resource and knowledge base dedicated to the development and application of large language models and autonomous agentic systems. It provides a structured framework for understanding prompt engineering, context management, and the architectural patterns required to build task

    Demonstrates essential setup procedures for connecting to and configuring external language model providers.

    MDX · agent · agents · ai-agents

Explore sub-tags

  • Decoding Strategies: Algorithms that control how models select the next token in a sequence, such as sampling or beam search.
  • Hardware Accelerators: Support for specialized hardware for model inference.
  • Inference Optimization Techniques: Methods to improve the speed, latency, and resource efficiency of model inference.
  • Large Language Model Configurations: Documentation and configuration settings used to define parameters and operational behavior for large language models.
  • Model Configuration: Utilities and settings for authenticating and connecting to external AI model providers.
  • Model Configuration Interfaces: Graphical or programmatic interfaces used to adjust model parameters and fine-tune inference behavior.