awesome-repositories.com
© 2026 Bringes Technology SRL·VAT RO45896025·hello@bringes.io
Serving & Runtime · Awesome GitHub Repositories

8 repos


Explore 8 awesome GitHub repositories matching Artificial Intelligence & ML · Serving & Runtime. Refine with filters or upvote what's useful.

Awesome Serving & Runtime GitHub Repositories

  • huggingface/transformers

    156,730 · GitHub

    Transformers is a comprehensive library for machine learning that provides a unified interface for training, fine-tuning, and deploying transformer-based models. It supports a wide range of tasks, including text classification, language modeling, question answering, and sequence-to-sequence translation, while offering…

    Optimizes memory usage and inference speed through automatic device mapping and half-precision weight support.

    Python · audio · deep-learning · deepseek
  • Shubhamsaboo/awesome-llm-apps

    96,116 · GitHub

    This repository serves as a comprehensive collection of resources, templates, and starter code for building artificial intelligence applications. It provides a centralized hub for developers to access practical implementations of common workflows, including retrieval-augmented generation pipelines and autonomous agent…

    Utilities and techniques help reduce token consumption and operational costs while preserving output quality.

    Python · agents · llms · python
  • ggml-org/llama.cpp

    95,400 · GitHub

    Llama.cpp is an inference engine designed for the local execution of text-based and multimodal language models on consumer hardware. It provides a core environment for running models that process both text and image inputs, utilizing hardware-accelerated backends to optimize performance across diverse CPU and GPU archi…

    Compresses model weights into quantized formats to significantly reduce memory footprint and boost inference speed.

    C++ · ggml
  • fighting41love/funNLP

    78,999 · GitHub

    This project is a community-driven knowledge base and curated repository focused on natural language processing and large language model development. It serves as a centralized index for high-quality tools, libraries, and research materials, organizing technical resources into structured, version-controlled documentati…

    Highlights efficient training and inference techniques designed to run massive models on hardware with constrained resources.

    Python
  • keras-team/keras

    63,858 · GitHub

    Keras is a high-level deep learning framework designed for constructing and training neural networks through the composition of modular, functional layers. It serves as a comprehensive modeling toolkit that provides standardized procedures for defining, evaluating, and deploying complex architectures. By utilizing a di…

    Applies hardware-specific tuning to model execution paths, significantly enhancing inference speed and throughput on diverse computing devices.

    Python · data-science · deep-learning · jax
  • ultralytics/yolov5

    56,830 · GitHub

    YOLOv5 is a comprehensive computer vision framework designed for end-to-end deep learning, specializing in real-time object detection, image classification, and instance segmentation. It provides a unified toolkit that manages the entire lifecycle of a model, from initial dataset configuration and hyperparameter tuning…

    Translates trained models into standard industry formats to ensure compatibility across diverse hardware and deployment environments.

    Python · coreml · deep-learning · ios
  • deepfakes/faceswap

    54,974 · GitHub

    Faceswap is a comprehensive framework for automated media manipulation and neural face synthesis. It provides a modular pipeline that manages the entire lifecycle of facial feature extraction, deep learning model training, and image conversion. By coordinating complex computer vision workflows, the system enables users…

    Converts trained models into inference-ready versions by calculating required layers and configuring swap parameters.

    Python · deep-face-swap · deep-learning · deep-neural-networks
  • unslothai/unsloth

    52,461 · GitHub

    Unsloth is a high-performance training and inference platform designed to optimize the lifecycle of large language and multimodal models. It provides a comprehensive engine for fine-tuning, executing, and managing models locally, with a focus on reducing memory consumption and increasing compute speed on consumer-grade…

    Exports custom model weights into standard file formats to ensure compatibility with local inference and production systems.

    Python · agent · deepseek · deepseek-r1

Explore sub-tags

  • Inference Optimization Utilities (3 sub-tags): Tools focused on post-training conversion, compilation, and hardware-specific acceleration for deployment-ready models.
  • Inference Optimizations (3 sub-tags): Techniques and mechanisms designed to reduce latency and increase throughput during the model inference phase.
  • Large Language Model Optimization (2 sub-tags): Methods and utilities specifically engineered to improve the speed and efficiency of large language model operations.
  • Model Quantization Tools: Utilities that reduce the precision of model weights to decrease memory usage and accelerate inference speeds.
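
For readers new to the quantization idea mentioned in the listings above, here is a minimal sketch of one common scheme, symmetric per-tensor int8 quantization, in plain Python. It is illustrative only and is not taken from any repository on this page; the function names `quantize_int8` and `dequantize_int8` and the sample weights are hypothetical. Storing each weight as an 8-bit code rather than a 32-bit float uses a quarter of the memory, at the cost of a small rounding error.

```python
def quantize_int8(weights):
    """Map floats to integer codes in [-127, 127] plus one shared scale.

    Hypothetical helper: symmetric, per-tensor quantization.
    """
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any scale works.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize_int8(codes, scale):
    """Recover approximate float weights from the integer codes."""
    return [c * scale for c in codes]


# Made-up sample weights, for illustration only.
weights = [0.5, -1.27, 0.0, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)

# Rounding to the nearest code bounds the error by about half a step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Real tools typically quantize per block or per channel and use more elaborate formats, so treat this as a picture of the principle rather than of any specific implementation.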