7 repositories
Direct programming of hardware using SIMD and inline assembly for maximum execution speed.
Distinct from Hardware Performance Tuning: candidates in that category focus on gaming or deep learning, whereas this category covers general-purpose systems programming for maximum speed.
Explore 7 awesome GitHub repositories matching operating systems & systems programming · Hardware-Level Performance Tuning.
ds4 is a local inference engine for DeepSeek models that includes a distributed runtime for splitting transformer layers across networked computers. It functions as a reasoning controller with a local weight streamer and an API server that streams chat completions via industry standard endpoints. The system employs a memory management model that loads model experts from disk on demand to execute models that exceed available system RAM. It provides controls for reasoning effort and model behavior steering, allowing the modification of response characteristics through activation directions. Th
Measures tokens per second and adjusts GPU power limits to optimize inference and manage heat.
Nvidia Profile Inspector is a Windows utility for modifying documented and hidden NVIDIA driver settings via the Driver Settings API. It functions as a GPU driver configuration tool that allows for the adjustment of global and per-application graphics settings to optimize performance and visual quality. The project features a specialized binary bitmask editor for precisely adjusting complex driver configurations and a dedicated utility for tuning Deep Learning Super Sampling, Ray Reconstruction, and Frame Generation settings. It also includes a driver profile backup tool to export and import
Adjusts advanced driver parameters and DLSS settings to balance visual quality and frame rates.
Waterfox is a privacy-focused web browser built on a fork of the Gecko engine that removes all telemetry and tracking code while preserving full extension compatibility. It encrypts DNS queries through independent third-party resolvers to prevent centralized monitoring of browsing destinations, and organizes browser tabs into hierarchical parent-child trees with collapsible branches and keyboard-driven navigation. The browser maintains a backward-compatible runtime bridge that supports both legacy XUL-based add-ons and modern WebExtensions simultaneously, allowing users to keep using older or
Optimizes graphics and compute workloads for modern CPU and GPU capabilities without vendor-specific telemetry.
c3c is the compiler for the C3 programming language, transforming source code into executable binaries, static libraries, or dynamic libraries using an LLVM backend. It implements a system based on result-based error handling, scoped memory pooling, and a semantic macro system. The compiler provides first-class support for hardware-backed SIMD vectors that map directly to processor instructions and enables runtime polymorphism through interface-based dynamic dispatch. The project covers a broad set of low-level capabilities, including manual and pooled memory management, inline assembly inte
Provides SIMD vector support and inline assembly to program hardware directly for maximum speed.
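To make "SIMD vectors that map directly to processor instructions" concrete, here is a minimal sketch. It is written in C rather than C3, and it assumes a GCC or Clang compiler, because it relies on the `vector_size` extension, which is not standard C. The names `f32x4` and `sum_f32` are hypothetical and chosen only for this illustration. Inline assembly is left out because it is tied to a specific architecture.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Four packed floats. vector_size is a GCC/Clang extension (assumption:
   one of those compilers is in use); the type name is hypothetical. */
typedef float f32x4 __attribute__((vector_size(16)));

/* Sum n floats four lanes at a time, then add the leftover tail. */
float sum_f32(const float *data, size_t n)
{
    assert(data != NULL || n == 0);

    f32x4 acc = {0.0f, 0.0f, 0.0f, 0.0f};
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        f32x4 v;
        memcpy(&v, data + i, sizeof v); /* load that is safe for unaligned input */
        acc += v;                       /* one vector add covers four lanes */
    }

    /* Horizontal reduction of the four lanes. */
    float total = acc[0] + acc[1] + acc[2] + acc[3];

    /* Scalar tail for the last n % 4 elements. */
    for (; i < n; i++)
        total += data[i];

    return total;
}
```

On targets with 128-bit vector units the compiler can typically lower the `acc += v` line to a single packed add. Note that the lane-wise order of additions differs from a plain left-to-right loop, so results can differ slightly for inputs that are not exactly representable.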
This project is a technical curriculum and set of educational resources focused on parallel programming, high-performance computing, and systems programming. It provides a structured course covering the implementation of parallel algorithms and multithreading techniques for processing large datasets. The project includes a systems programming guide for modern language features, a framework for lock-free concurrency patterns, and a manual for optimizing CPU and GPU performance through assembly analysis and cache management. The material covers hardware performance tuning, the implementation o
Employs SIMD and inline assembly to achieve maximum hardware execution speed.
LACT is an AMD GPU control suite and performance tuner for Linux systems. It functions as a GPU hardware monitor and management interface designed to optimize VRAM and processor frequencies through offsets and custom power profiles. The project utilizes a client-server architecture, allowing a graphical interface to monitor and control graphics hardware over a network connection via a background daemon. This remote management capability enables users to manage hardware on distant machines from a centralized interface. The suite covers hardware tuning for overclocking and undervolting, power
Enables precise adjustment of GPU clock speeds, voltages, and power limits for performance and stability.
Universal-x86-Tuning-Utility is a system tuning tool for x86 hardware that adjusts CPU, GPU, and memory settings to optimize performance and power consumption. It provides an adaptive power optimization algorithm that dynamically adjusts processor power limits based on real-time temperature monitoring, balancing performance with thermal safety margins. The utility also includes a hardware specification viewer that displays detailed system information for reference. The tool distinguishes itself through event-driven profile automation, which applies pre-configured tuning profiles automatically
Modifies graphics card clock speeds, voltages, and power targets to balance speed and thermal output.
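The listing describes an adaptive algorithm that adjusts power limits from temperature readings but does not show how it works. The following is only a minimal sketch of one possible approach, a fixed-step controller with clamping, and is not the utility's actual implementation. The function name `next_power_limit_w` and all of its parameters are hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: given the current power limit and a temperature
   reading, return the next power limit in watts. This is an assumed
   fixed-step scheme, not the algorithm used by any particular tool. */
int next_power_limit_w(int current_w, int temp_c, int target_c,
                       int step_w, int min_w, int max_w)
{
    assert(min_w <= max_w && step_w > 0);

    int next = current_w;

    if (temp_c > target_c)
        next -= step_w;   /* too hot: back off */
    else if (temp_c < target_c)
        next += step_w;   /* thermal headroom: allow more power */

    /* Keep the result inside the configured safety range. */
    if (next < min_w)
        next = min_w;
    if (next > max_w)
        next = max_w;

    return next;
}
```

Calling this once per monitoring interval moves the limit toward the highest value that keeps the temperature at or below the target. A real controller would likely add hysteresis or smoothing so that the limit does not oscillate around the target.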