Hardware-Accelerated Compute Backends · Awesome GitHub Repositories

Optimized kernels and execution strategies for mapping operations onto GPUs and specialized silicon.

Explore 6 GitHub repositories in DevOps & Infrastructure · Hardware-Accelerated Compute Backends.

Awesome Hardware-Accelerated Compute Backends GitHub Repositories

  • pytorch/pytorch

    97,601 · GitHub

    PyTorch is a machine learning framework centered on a GPU-ready tensor library that supports multi-dimensional array operations across both CPU and accelerator hardware. It provides a foundational infrastructure for mathematical computation and dynamic neural network construction, utilizing a tape-based automatic diffe

    Enables direct sharing of accelerator-resident memory buffers between separate processes.

    Python · autograd · deep-learning · gpu
  • twitter/the-algorithm

    72,764 · GitHub

    The algorithm is a distributed recommendation engine pipeline designed to construct and serve personalized content timelines. It functions as a multi-stage orchestration layer that aggregates candidate content from diverse social graphs and high-dimensional embedding spaces, processing user interaction data to deliver

    Powers high-performance infrastructure for deploying and serving machine learning models within a recommendation pipeline.

    Scala
  • josephmisiti/awesome-machine-learning

    71,702 · GitHub

    This project is a comprehensive, community-driven directory of machine learning resources, software libraries, and educational materials. It serves as a centralized knowledge base for developers and researchers, organizing tools and frameworks by their primary programming language and technical domain to simplify disco

    Optimizes compute performance by mapping intensive mathematical operations onto GPU and specialized hardware backends.

    Python
  • vllm-project/vllm

    70,745 · GitHub

    vLLM is a high-throughput inference engine designed for the efficient serving and execution of large language models. It functions as a production-ready distributed model server, providing standard API protocols for online serving while also supporting offline batch processing. The system is built to maximize token gen

    Maps complex mathematical operations onto diverse graphics processing units and specialized silicon using optimized kernels.

    Python · amd · blackwell · cuda
  • ultralytics/yolov5

    56,830 · GitHub

    YOLOv5 is a comprehensive computer vision framework designed for end-to-end deep learning, specializing in real-time object detection, image classification, and instance segmentation. It provides a unified toolkit that manages the entire lifecycle of a model, from initial dataset configuration and hyperparameter tuning

    Minimizes memory footprint on resource-constrained hardware by disabling non-essential services and graphical interfaces.

    Python · coreml · deep-learning · ios
  • tensorflow/tfjs-examples

    6,783 · GitHub

    This repository provides a collection of practical demonstrations and implementation guides for machine learning tasks using TensorFlow.js. It serves as a resource for developers to explore model architectures, training workflows, and data manipulation techniques across domains such as computer vision, natural language

    Native binary acceleration optimizes linear algebra computations on the CPU across multiple operating systems.

    JavaScript

Explore sub-tags

  • CUDA Tensor Sharing: Mechanisms for sharing accelerator memory across processes.
  • Kernel Fusion Strategies: Techniques for combining multiple operations into single execution units to reduce overhead.
  • ML Serving: High-performance environments for deploying predictive models.
  • Memory Optimization Techniques: Strategies for minimizing memory usage in resource-constrained deployment environments.