30 open-source projects similar to ffmpeg/asm-lessons, ranked by how many features they have in common. Compare stars, activity, and what each one does to find the best Asm Lessons alternative.
This project is an ARMv8 assembly programming guide and tutorial designed to teach the translation of high-level logic into low-level machine instructions. It serves as a low-level systems programming reference for writing and executing code specifically for the ARMv8 architecture. The resource provides instructions for cross-platform assembly implementation, focusing on unifying symbol naming and memory addressing to ensure source code builds across different operating systems. It also covers the integration of assembly routines with higher-level languages using standardized calling conventi
This project is a comprehensive technical guide to advanced Go programming. It serves as a programming guide, technical reference, and textbook covering low-level optimization and distributed systems architecture. The resource gives detailed guidance on writing assembly to optimize performance-critical code and on managing C-Go interoperability to bridge C libraries with Go. It also functions as a manual for implementing remote procedure call mechanisms and creating custom plugins for the Protocol Buffers compiler. The material covers high-level capabilities including the de
This project is an educational resource and technical reference for building operating systems from scratch. It provides a comprehensive guide to mastering x86 architecture and implementing core kernel components by writing code that executes directly on hardware without the support of standard libraries or operating system abstractions. The materials focus on low-level systems engineering, teaching users how to interpret technical datasheets to manage hardware resources. It covers the fundamental mechanics of bare-metal programming, including the use of assembly language to define execution
xxHash is a high-performance, non-cryptographic hash library designed for rapid checksum generation and data integrity verification. It functions as an incremental hashing engine, allowing for the processing of large or streaming data inputs by maintaining a persistent internal state across sequential chunks. The library is engineered as a computational framework that maximizes throughput by utilizing wide CPU registers and branchless instruction pipelining. It achieves high-speed performance by aligning data access with CPU cache lines and employing multi-stage mixing functions that ensure c
HelloSilicon is a programming guide and tutorial for writing and debugging low-level 64-bit assembly code specifically for Apple Silicon processors. It serves as an architecture reference for interacting with macOS and iOS kernel services using system calls and hardware registers. The project provides specialized instruction on foreign function interfacing to bridge assembly with high-level languages like C or Python. It also includes a toolkit for configuring Mach-O binaries and compiling universal binaries and dynamic libraries for cross-hardware compatibility. The material covers low-leve
This project is a comprehensive educational framework designed to guide learners through the complexities of systems engineering and low-level software development. It provides structured learning paths that integrate hardware simulation, source code analysis, and project-based exercises to help developers master the foundational concepts of computer architecture, operating systems, and firmware design. The curriculum distinguishes itself by emphasizing direct interaction with system internals, requiring learners to examine and modify existing open-source kernel and driver implementations. By
Bend is a high-level parallel programming language and compiler designed to execute code across multi-core CPUs and GPUs automatically. By translating functional source code into a graph-based intermediate representation, it enables massive parallel execution without requiring manual management of threads, locks, or atomic operations. The runtime operates as an interaction net engine, where computations are represented as networks of nodes that reduce through local rewriting rules. This model utilizes a work-stealing scheduler to distribute tasks across thousands of hardware threads, ensuring
This project is a comprehensive, curated directory of high-quality libraries, tools, and educational resources for C and C++ development. It serves as an ecosystem discovery index, helping developers navigate the vast landscape of third-party components, frameworks, and technical documentation available for the language. The collection is distinguished by its focus on high-performance systems programming and technical mastery. It provides deep coverage of specialized domains including SIMD-accelerated data processing, compile-time template metaprogramming, and asynchronous event-driven archit
This project is an open-source software engineering handbook and technical learning resource focused on backend web development. It provides a comprehensive guide to building server-side applications, covering the end-to-end flow of web requests from initial HTTP traffic handling to database integration and dynamic content rendering. The material follows a code-centric pedagogical pattern, anchoring theoretical concepts in functional snippets that demonstrate practical implementation. The curriculum is organized through progressive complexity sequencing, moving from foundational language synt
This project is an interactive data science environment that combines code execution, rich media visualization, and narrative documentation into a persistent, browser-based platform. It serves as a comprehensive educational resource for scientific computing, providing a framework for iterative data analysis and machine learning prototyping. The environment is distinguished by its focus on high-performance numerical computing, utilizing vectorized array operations and memory-mapped data structures to handle large-scale computations efficiently. It features a unified estimator interface that st
Lua is an embeddable scripting language written in ISO C, designed to be integrated into host applications for runtime customization. It provides a C-based scripting engine and a prototype-based object model that utilizes associative arrays and metatables to implement inheritance and complex data structures. The language features a cooperative multitasking system that manages concurrent execution threads via coroutines and an incremental garbage collector for automatic memory management. It includes a safe code sandbox to isolate global state and run untrusted scripts within a protected envir
PlatformIO Core is a toolset for embedded software development that manages the compilation, flashing, and debugging of firmware for various microcontroller targets. It provides a cross-platform build system that automates the process of transforming source code into binaries and transferring them to hardware via serial protocols. The system uses a plugin-based architecture to extend hardware platform support and incorporates a manifest-driven approach to resolve and install the specific toolchains, frameworks, and libraries required for different board definitions. Capabilities cover the fu
This project serves as a comprehensive educational resource for learning parallel programming and high-performance computing using graphics processing units. It provides technical guidance on the fundamental paradigms required to offload computationally intensive tasks from a host system to specialized hardware accelerators. The materials cover the core methodologies for managing data-parallel operations, including the orchestration of memory between host and device spaces and the organization of threads into structured grids and blocks. It details the execution models necessary to distribute
RIOT is a real-time operating system designed for resource-constrained microcontrollers. It provides a kernel for managing hardware peripherals, memory, and multitasking on embedded devices, featuring a microcontroller hardware abstraction layer to unify hardware access across different chipsets. The system employs a preemptive tickless task scheduler with priority-based execution to maximize energy efficiency in battery-powered hardware. It also includes an embedded security framework consisting of cryptographic APIs and secure transport protocols to facilitate authenticated over-the-air fir
This project is a collection of educational resources and technical guides focused on Go performance optimization. It provides instruction on improving execution speed and reducing memory usage through code and architectural refinements. The guides cover advanced strategies for low-level programming, including the use of assembly for SIMD instructions and unsafe pointers for direct memory manipulation. It also details concurrency optimization techniques such as lock sharding and cache-line padding to reduce contention and improve hardware utilization. The material encompasses broad capabilit
OffensiveRust is a red team toolkit and malware development kit written in Rust. It serves as an evasion framework and post-exploitation library, providing a collection of offensive security primitives and a Windows API wrapper for interacting with low-level system functions and undocumented APIs. The project focuses on bypassing security software through direct system calls, memory obfuscation, and stealthy payload execution. It implements techniques to defeat static binary analysis via compile-time string encryption and payload obfuscation, while avoiding detection using parent process ID s
Red is a programming language with a native compiler that translates high-level source code into standalone executables for Windows, macOS, and Linux without external runtime dependencies. It combines a cross-platform GUI development framework that renders native operating system widgets from a single codebase with a reactive data binding system that automatically synchronizes UI state with data sources. The language also includes an embedded DSL and parsing engine based on PEG grammar rules for defining and processing domain-specific languages within the language itself. The project distingu
BlocksKit is a low-level utility library for Apple platform development, specifically designed for managing the execution flow and memory of blocks within macOS and iOS applications. It provides a collection of helper functions to simplify the use of blocks in Objective-C and C, reducing boilerplate code and addressing inherent technical limitations. The library focuses on bridging Objective-C blocks with legacy C-based APIs by providing compatible wrapper structures and function-pointer emulation. It enables the passing of blocks through system interfaces that require strict C-style callback
This project is a structured educational resource designed to guide developers through the mastery of the JavaScript programming language. It utilizes a progressive curriculum that organizes technical concepts into a daily learning path, allowing students to build foundational knowledge before advancing to complex application development. The resource distinguishes itself through a hands-on training model that combines detailed explanations with practical code challenges. By focusing on an interactive learning experience, it reinforces core language principles—such as data types, functional p
TinyGo is a specialized compiler and development toolkit designed to bring the Go programming language to resource-constrained microcontrollers and WebAssembly environments. It provides a bare-metal runtime environment that enables high-level code execution without the need for a traditional operating system, utilizing an LLVM-based backend to generate efficient machine instructions. The project distinguishes itself through aggressive optimization techniques tailored for small hardware, including a static memory allocation strategy and whole-program dead code elimination that significantly re
Surge is a Swift library for high-performance numerical analysis, linear algebra, digital signal processing, and accelerated image manipulation. It utilizes the Accelerate framework to provide hardware-accelerated tools for matrix mathematics and signal processing. The library provides specialized capabilities for digital signal processing, including convolution, signal similarity analysis through cross-correlation, and domain transformations using fast Fourier transforms. It also includes a suite of tools for the rapid transformation and analysis of pixel buffers and image data. Beyond sign
HVM2 is a high-performance execution environment for pure functional programs, implemented as a systems-level runtime in Rust. It functions as a massively parallel functional runtime that uses interaction combinators to achieve automatic parallelism across multi-core CPUs and GPUs. The project distinguishes itself by using a graph-rewriting computational model to execute programs via local reduction rules, which eliminates the need for manual locks or atomic operations. It employs beta-optimal reduction and lazy evaluation to optimize higher-order functions and eliminate redundant computation
Napajs is an embeddable JavaScript engine and multi-threaded runtime designed to be integrated directly into other software applications as a component. It serves as a parallel computation framework that allows JavaScript code to execute across multiple threads, bypassing the standard single-threaded event loop limitation to handle CPU-intensive tasks. The runtime is distinguished by its ability to load and execute modules from the NPM ecosystem and its pluggable execution environment. This architecture allows for custom implementations of memory allocation, system logging, and performance me
AISystem is a comprehensive AI full-stack infrastructure project covering the entire pipeline from AI chip architecture to high-level training frameworks. It encompasses the development of AI compiler frameworks, inference engines, and distributed training orchestrators designed to coordinate workloads across a heterogeneous compute stack of CPUs, GPUs, and NPUs. The project focuses on the deep integration of software and hardware, employing software-hardware co-design to align tensor layouts with physical memory structures. It provides specialized capabilities for accelerating Transformer mo
Highway is a portable C++ library and hardware abstraction layer designed for writing single instruction multiple data (SIMD) code. It provides a unified interface that maps data-parallel logic to various CPU instruction sets, enabling the development of high-performance software that runs across different processor architectures without requiring architecture-specific assembly. The project features a dynamic instruction dispatcher that selects the most efficient CPU instruction set at runtime based on detected hardware. It also supports static target specialization and extensible mechanisms
This library provides a collection of low-level abstractions for interacting with hardware peripherals on Raspberry Pi devices using the Rust programming language. It serves as a type-safe interface for controlling physical pins and managing communication with external electronic components and sensors. The project distinguishes itself through its use of compile-time abstractions that map high-level function calls directly to hardware instructions, ensuring minimal runtime overhead. It provides consistent access to hardware by wrapping kernel-level device interfaces and memory-mapped register
h2o-3 is a distributed machine learning platform and automated machine learning framework designed for training and deploying predictive models using distributed in-memory computing. It functions as a deep learning framework and a distributed model scoring engine, capable of operating as a Kubernetes ML cluster to process large datasets in parallel. The platform distinguishes itself through automated machine learning capabilities that automatically select the best algorithms and hyperparameters to optimize model performance. It provides specialized deep learning toolkits for tasks including i
mctx is a framework for executing high-performance tree search and state simulations to generate policy targets for neural networks. It functions as a compiled search engine and neural dynamics simulator that predicts state transitions and rewards using learned representations. The project implements a vectorised tree search capable of running parallel search operations across input batches. It utilizes a policy target generator to convert search results into action weights used for training and refining neural network policies. The system covers reinforcement learning workflows by integrati
Open-smartwatch-os is an operating system designed for resource-constrained wearable devices. It provides a development and testing environment that allows developers to build, validate, and debug firmware for smartwatches within a simulated desktop environment before deploying to physical hardware. The project distinguishes itself through a comprehensive suite of tools for hardware emulation and remote diagnostics. By utilizing a hardware abstraction layer and modular driver architecture, it decouples core system logic from specific physical components. This enables developers to run the ope
This repository is a collection of reference implementations and programming examples for the CUDA Toolkit. It serves as a GPGPU implementation guide and a parallel computing reference, providing code for using graphics hardware to perform general-purpose calculations and high-performance parallel processing. The project provides specific samples for GPU kernel development and resource management. These include demonstrations of multi-GPU communication, peer-to-peer memory access, and system hardware inspection to coordinate distributed GPU resources. The codebase covers a wide range of capa