30 open-source projects similar to openmathlib/openblas, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best OpenBLAS alternative.
OpenBLAS is a high-performance Basic Linear Algebra Subprograms (BLAS) library that provides optimized matrix and vector operations. It serves as a multi-architecture math backend and numerical computing framework designed to execute complex mathematical calculations and high-speed numerical analysis. The library detects the CPU at runtime and applies the most efficient operation kernels for that specific processor. It supports multiple CPU targets through a combination of optimized assembly and C implementations. The project covers high-performance
ndarray is a multidimensional array library for Rust that serves as a linear algebra framework and scientific computing tool. It provides the core infrastructure for creating and manipulating n-dimensional arrays, functioning as both a parallel array processor and a toolkit for numerical data analysis. The library distinguishes itself by providing efficient slicing and memory views, allowing for data sharing without copying. It leverages optimized backend math libraries for high-speed matrix multiplication and distributes heavy mathematical iterations across multiple CPU threads to accelerate
This project is a collection of foundational machine learning algorithms and data science tools implemented in Python. It focuses on building the logic of these tools using basic programming primitives rather than relying on specialized libraries. The implementation covers several core domains, including a linear algebra library for matrix and vector operations, a statistical analysis toolkit for probability and hypothesis testing, and a framework for map-reduce distributed processing. It also includes implementations for natural language processing, graph theory for network analysis, and var
NumPy is a foundational library for scientific computing in Python, providing a comprehensive framework for managing and manipulating large-scale numerical information. It centers on high-performance multidimensional array objects that serve as the primary data structure for complex mathematical operations and data analysis workflows. The library distinguishes itself through specialized mechanisms for handling multidimensional data, including advanced indexing, slicing, and broadcasting techniques that allow for efficient operations across arrays of varying shapes. It utilizes strided metadat
This project is a structured learning curriculum and technical reference for mastering deep learning with TensorFlow. It provides a comprehensive guide for building, training, and deploying neural networks, combining theoretical fundamentals with practical implementation examples. The repository distinguishes itself by covering the end-to-end machine learning workflow, from low-level tensor mathematics and linear algebra to the creation of complex model architectures. It includes specific guidance on developing data pipelines for diverse data types, such as images, text, and time-series seque
Surge is a Swift library for high-performance numerical analysis, linear algebra, digital signal processing, and accelerated image manipulation. It utilizes the Accelerate framework to provide hardware-accelerated tools for matrix mathematics and signal processing. The library provides specialized capabilities for digital signal processing, including convolution, signal similarity analysis through cross-correlation, and domain transformations using fast Fourier transforms. It also includes a suite of tools for the rapid transformation and analysis of pixel buffers and image data. Beyond sign
LAPACK is a comprehensive library of Fortran routines designed for high-performance numerical analysis and linear algebra. It serves as a foundational scientific computing framework, providing standardized procedures for solving systems of linear equations, eigenvalue problems, and least squares approximations. The library distinguishes itself through a hierarchical routine abstraction that organizes mathematical operations into distinct levels of complexity. It utilizes block-partitioned matrix algorithms and a column-major memory layout to optimize data locality and hardware efficiency. By
xtensor is a C++ multidimensional array library for numerical computing that provides N-dimensional containers with an interface mirroring the NumPy API. It utilizes a lazy evaluation expression engine to defer numerical computations until assignment, which minimizes memory allocations and intermediate copies. The library features a foreign memory array adaptor that allows it to wrap external buffers, such as NumPy arrays, to perform numerical operations in-place without duplicating data. It further optimizes performance through lazy broadcasting and a system that manages the lifetime of temp
Magnum is a C++ middleware suite for cross-platform graphics development and real-time data visualization. It provides a hardware-agnostic rendering layer that translates graphics commands into platform-specific calls, ensuring consistent behavior across different GPU drivers and APIs such as Vulkan. The project focuses on decoupling application logic from underlying hardware through abstract graphics and system utilities. It features a plugin-based resource importer for 3D assets and audio, a hierarchical scene graph for spatial transformations, and a high-performance signal-based event syst
libigl is a C++ geometry processing library used for analyzing and manipulating 3D triangle and tetrahedral meshes. It functions as a numerical linear algebra suite and a mesh manipulation framework, integrating a geometric deformation engine to implement rigid and polyharmonic transformations. The project is distinguished by its header-only library design and its implementation of specialized deformation techniques, including as-rigid-as-possible and polyharmonic shape deformation. It also provides a visualization tool for rendering surfaces and scalar fields with interactive scene controls and
ArrayFire is a hardware-agnostic compute framework and JIT-compiled tensor engine designed for high-performance numerical computing. It serves as a GPU numerical computing library and parallel signal processing toolkit that abstracts hardware backends, allowing the same codebase to execute across various GPU architectures and CPUs. The project distinguishes itself through a JIT engine that uses expression compilation to fuse operations and minimize memory overhead. It employs a deferred execution graph to optimize computation chains and provides interoperability primitives to share data and e
Math.js is a comprehensive JavaScript library for scientific, complex, and arbitrary precision calculations. It functions as a symbolic computation engine, a linear algebra toolkit, a statistical analysis library, and a unit conversion system. The project distinguishes itself by providing a symbolic engine capable of parsing, simplifying, and manipulating mathematical expressions algebraically without requiring immediate numerical evaluation. It includes a framework for defining and converting physical quantities with units of measure and automatic prefix support. The library covers a broad
Linfa is a classical machine learning framework and statistical learning suite implemented in Rust. It provides a collection of algorithms for supervised and unsupervised learning, focused on traditional statistical methods such as regression, clustering, and decision trees. The toolkit is distinguished by its ability to be compiled into WebAssembly, enabling analytical models to execute within browser environments. It employs a trait-based algorithm interface to standardize the process of training and prediction across its various models. The library covers a broad range of capabilities, in
This project is an educational codebase and reference library that translates theoretical deep learning concepts into executable PyTorch code. It serves as a practical implementation of a deep learning textbook, providing a course-like structure of guided exercises and architectural examples for learning purposes. The repository includes a library of standard neural network architectures, including linear, convolutional, recurrent, and transformer models. It specifically implements a variety of deep learning patterns such as multilayer perceptrons, VGG networks, gated recurrent units, and lon
This project is a comprehensive library for numerical linear algebra and scientific computing, designed to provide optimized routines for matrix decomposition, statistical modeling, and high-performance data analysis. It serves as both a toolkit for solving complex linear systems and an educational resource for understanding the fundamental algorithms behind matrix factorizations and numerical solvers. The library distinguishes itself through a focus on randomized numerical linear algebra, utilizing probabilistic algorithms and approximate methods to perform dimensionality reduction and matri
Highway is a portable C++ library and hardware abstraction layer designed for writing single instruction multiple data (SIMD) code. It provides a unified interface that maps data-parallel logic to various CPU instruction sets, enabling the development of high-performance software that runs across different processor architectures without requiring architecture-specific assembly. The project features a dynamic instruction dispatcher that selects the most efficient CPU instruction set at runtime based on detected hardware. It also supports static target specialization and extensible mechanisms
This project is a set of exercise solutions and implementation guides for visual simultaneous localization and mapping. It provides a collection of worked code examples and mathematical solutions designed to translate theoretical localization and mapping concepts into practical implementations. The repository serves as a technical companion for academic study, featuring worked answers to SLAM exercises and a system for tracking typographical and technical corrections to maintain the accuracy of the associated written work. The codebase covers spatial mathematics and robotics geometry, includ
ISPC is a vectorizing compiler and SIMD parallel programming language that implements a single program multiple data model. It serves as a toolchain for translating C-based code with parallel extensions into optimized machine code for various CPU and GPU architectures using an LLVM backend. The compiler is designed for cross-platform SIMD toolchain support, generating specialized instruction sets for x86 SSE/AVX, ARM NEON, and Intel GPU from a single source. It features a runtime dispatch mechanism that selects the most efficient hardware-specific implementation for the current system during
jetson-inference is a set of libraries and tools for executing optimized deep learning models on embedded GPU hardware. Its primary purpose is to enable real-time computer vision and AI inference at the edge with low latency and high throughput. The project distinguishes itself through high-performance streaming analytics and the ability to execute concurrent AI pipelines on auto-grade silicon. It provides specialized support for multi-sensor stream processing, utilizing zero-copy data transport to load camera frames directly into GPU memory. The codebase covers a broad surface of capabiliti
This is a game decompilation project consisting of a reconstructed C source codebase and the systems used for binary reconstruction. It provides a human-readable version of a commercial game title created through static and dynamic analysis to facilitate technical study and modification. The project utilizes a containerized build environment to ensure reproducible compilation and consistent toolchain versions across different host operating systems. It includes a game binary reconstructor that translates original machine code into source code and a system for compiling the codebase in
oneDNN is a library for deep learning acceleration that provides optimized building blocks for neural network training and inference. It manages tensor computation across CPU and GPU hardware, enabling the execution of high-performance primitives for model training and neural network inference optimization. The project distinguishes itself through hardware-specific kernel optimization and the use of just-in-time compilation to target specific processor instruction sets. It supports quantized neural network execution using both static and dynamic quantization to reduce memory usage and increas
Asterinas is a memory-safe operating system kernel designed to prevent data races and memory corruption. It functions as a Linux-ABI compatible kernel, enabling the execution of existing Linux binaries and container workloads while providing a declarative operating system distribution model. The project distinguishes itself by acting as a virtual machine container host and a confidential computing guest OS, allowing it to run within hardware-isolated Trusted Execution Environments such as Intel TDX. It implements a minimal trusted computing base by isolating unsafe low-level operations and se
seL4 is a formally verified microkernel whose C implementation is backed by machine-checked mathematical proofs of correctness, confidentiality, integrity, and availability. It enforces strict isolation between processes through hardware-enforced address space separation and a capability-based access control system, where each process holds explicit rights only to the resources it has been granted. The kernel exposes hardware resources through a minimal API of system calls that manage threads, address spaces, and inter-process communication, with synchronous IPC supporting sender-identifying b
SciPy is a scientific computing library for Python that provides a comprehensive collection of mathematical algorithms and numerical tools for research and engineering. It functions as a high-performance numerical analysis framework, bridging high-level Python code with compiled C and Fortran routines to execute complex computations at hardware speeds. The library is built upon array-based data structures that utilize strided memory layouts to enable efficient data manipulation and slicing. By employing vectorized operation dispatch and linking to optimized hardware-specific linear algebra li
This project is an open-source Linux GPU kernel driver implemented as a loadable kernel module. It functions as a GPU firmware loader, providing the low-level driver services necessary to enable direct communication between the operating system and graphics processing units. The driver utilizes a dual-module architecture that separates GPL-licensed kernel code from proprietary firmware blobs. This system extracts and links signed binary firmware images into the kernel modules at driver load time. The project provides driver support for Turing-architecture GPUs and all newer hardwa
nalgebra is a linear algebra library for Rust that provides matrix and vector operations with support for both compile-time and runtime dimensions. It functions as a numerical analysis library and a sparse matrix library, offering a mathematical framework capable of running in embedded environments and WebAssembly without requiring the Rust standard library. The project distinguishes itself as a geometric transformation library, utilizing homogeneous coordinates, quaternions, and isometries to handle 3D rotations, translations, and projections. It implements a variety of matrix decompositions
This project is a numerical computing library designed for scientific and engineering mathematical operations. It functions as a comprehensive linear algebra framework, a statistical analysis library, and a toolkit for mathematical optimization and numerical integration. The library is distinguished by its provider-based native acceleration, which allows managed code to be swapped for platform-native binary libraries to increase the performance of computationally intensive routines. It also supports a hybrid approach to matrix storage, implementing separate strategies for dense and sparse mat
Smile is a comprehensive JVM machine learning library and statistical computing toolkit. It provides a suite of algorithms for classification, regression, and clustering, implemented natively for Java, Scala, and Kotlin. The project also functions as a deep learning framework, a natural language processing library, and an inference engine for large language models. The library distinguishes itself through GPU acceleration via LibTorch bindings and support for the ONNX model interchange format. It includes specialized capabilities for large language model inference, featuring Byte-Pair Encodin
Deeplearning4j is a JVM-based deep learning framework and tensor computing library. It provides a computational graph engine for defining and executing deep learning workflows and mathematical operations within the Java Virtual Machine. The project includes a dedicated importer for loading and running pretrained models exported from Keras, TensorFlow, and ONNX formats. Its tensor computing capabilities are driven by a modular native C++ math core to execute high-performance linear algebra operations. The framework covers neural network training, deep learning model inference, and the constru
Gonum is a numerical computing library for the Go programming language, providing a collection of packages for scientific computing, linear algebra, statistics, and optimization. It functions as a framework for performing complex numerical computations and solving systems of linear equations. The project includes a dedicated graph analysis framework for modeling network graphs and solving connectivity and pathfinding problems. It also provides a statistical analysis toolkit for computing descriptive and inferential statistics and estimating mixture entropy. The library's capability surface c