c3c is the compiler for the C3 programming language, transforming source code into executable binaries, static libraries, or dynamic libraries using an LLVM backend. It implements result-based error handling, scoped memory pooling, and a semantic macro system. The compiler provides first-class support for hardware-backed SIMD vectors that map directly to processor instructions and enables runtime polymorphism through interface-based dynamic dispatch. The project covers a broad set of low-level capabilities, including manual and pooled memory management, inline assembly inte
ISPC is a vectorizing compiler and SIMD parallel programming language that implements a single program, multiple data (SPMD) model. It serves as a toolchain for translating C-based code with parallel extensions into optimized machine code for various CPU and GPU architectures using an LLVM backend. The compiler is designed for cross-platform SIMD support, generating code for x86 SSE/AVX, ARM NEON, and Intel GPU instruction sets from a single source. It features a runtime dispatch mechanism that selects the most efficient hardware-specific implementation for the current system during
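The runtime dispatch idea in the ISPC entry can be illustrated with a minimal sketch in plain C. This is not ISPC's generated code: the feature bits, the `variant` table, and the names `select_variant` and `dispatch_scale` are all hypothetical, and the three kernels are identical stand-ins for what would be per-target builds of one function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical feature bits. A real program would fill these in
   from CPUID or an OS query instead of passing them by hand. */
enum { FEAT_SSE4 = 1u << 0, FEAT_AVX2 = 1u << 1 };

typedef void (*scale_fn)(float *data, size_t n, float factor);

/* Stand-ins for per-target builds of one kernel. All three compute the
   same result; in a multi-target build each would use a different ISA. */
static void scale_generic(float *d, size_t n, float f) {
    for (size_t i = 0; i < n; i++) d[i] *= f;
}
static void scale_sse4(float *d, size_t n, float f) {
    for (size_t i = 0; i < n; i++) d[i] *= f;
}
static void scale_avx2(float *d, size_t n, float f) {
    for (size_t i = 0; i < n; i++) d[i] *= f;
}

typedef struct {
    unsigned required; /* feature bits this variant needs */
    scale_fn fn;
} variant;

/* Ordered from most to least capable; the last entry needs nothing. */
static const variant variants[] = {
    { FEAT_AVX2, scale_avx2 },
    { FEAT_SSE4, scale_sse4 },
    { 0u,        scale_generic },
};

/* Return the first variant whose required features are all present. */
static const variant *select_variant(unsigned features) {
    size_t count = sizeof variants / sizeof variants[0];
    for (size_t i = 0; i < count; i++) {
        if ((features & variants[i].required) == variants[i].required)
            return &variants[i];
    }
    return &variants[count - 1]; /* unreachable: generic always matches */
}

/* Pick a variant for the given feature set and run it. */
static void dispatch_scale(unsigned features, float *data, size_t n, float factor) {
    assert(data != NULL || n == 0);
    select_variant(features)->fn(data, n, factor);
}
```

In this sketch the caller passes the feature mask explicitly, which keeps the selection logic easy to test. A production dispatcher would typically detect features once and cache the chosen function pointer.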
OpenBLAS is a high-performance implementation of the Basic Linear Algebra Subprograms (BLAS) standard designed for numerical computing and matrix operations. It serves as a hardware-accelerated numerical library, providing a computational engine for large-scale matrix multiplication and vector operations. The library distinguishes itself through hand-tuned assembly kernels that use SIMD instruction sets such as AVX and SVE to maximize floating-point performance on specific CPU architectures. It features a multi-threaded framework that manages parallel exe
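For reference, the matrix multiplication that BLAS defines as GEMM computes C = alpha * A * B + beta * C. The sketch below is a naive triple loop in plain C and is not OpenBLAS code. The name `naive_dgemm` and the row-major layout are assumptions made for illustration. Tuned kernels produce the same mathematical result with blocking, SIMD, and threading.

```c
#include <assert.h>
#include <stddef.h>

/* Naive reference for C = alpha * A * B + beta * C.
   All matrices are row-major: A is m x k, B is k x n, C is m x n. */
void naive_dgemm(size_t m, size_t n, size_t k,
                 double alpha, const double *a, const double *b,
                 double beta, double *c) {
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t i = 0; i < m; i++) {
        for (size_t j = 0; j < n; j++) {
            double acc = 0.0;
            /* Dot product of row i of A with column j of B. */
            for (size_t p = 0; p < k; p++) {
                acc += a[i * k + p] * b[p * n + j];
            }
            c[i * n + j] = alpha * acc + beta * c[i * n + j];
        }
    }
}
```

A loop like this is useful mainly as a correctness baseline when checking the output of an optimized library on small inputs.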
ZLUDA is a middleware and translation engine designed to enable the execution of unmodified proprietary compute binaries on non-native graphics hardware. It functions as a compatibility layer that bridges vendor-specific compute interfaces with open standards, allowing software originally restricted to a single hardware ecosystem to operate on alternative graphics processing units. The project achieves this through a combination of dynamic library interception and runtime instruction translation. By replacing standard system libraries and mapping proprietary compute calls to open standards, t