30 open-source projects similar to arrayfire/arrayfire, ranked by how many features they have in common. Compare stars, activity, and what each one does to find the best ArrayFire alternative.
xtensor is a C++ multidimensional array library for numerical computing that provides N-dimensional containers with an interface mirroring the NumPy API. It utilizes a lazy evaluation expression engine to defer numerical computations until assignment, which minimizes memory allocations and intermediate copies. The library features a foreign memory array adaptor that allows it to wrap external buffers, such as NumPy arrays, to perform numerical operations in-place without duplicating data. It further optimizes performance through lazy broadcasting and a system that manages the lifetime of temp
This project is an educational resource and a collection of instructional materials for performing data manipulation and statistical analysis using Python. It provides a comprehensive set of guides and code examples for using the Pandas, NumPy, and Matplotlib libraries to analyze structured data. The resource includes a dedicated guide for reshaping, cleaning, and aggregating tabular data and time series via Pandas, alongside a reference for high-performance vectorized operations and linear algebra using NumPy. It also features tutorials for creating publication-quality charts, distribution p
Torch7 is a scientific computing environment and tensor computation library used for deep learning research and numerical analysis. It functions as a Lua-based framework for training neural networks and learning agents, providing a toolkit for implementing architectures and training through reinforcement learning algorithms. The project is distinguished by its tight integration with C, utilizing a binding layer to map high-level scripting to low-level C structures for direct memory access. It supports hardware-accelerated computation by offloading linear algebra and convolution operations to
ndarray is a multidimensional array library for Rust that serves as a linear algebra framework and scientific computing tool. It provides the core infrastructure for creating and manipulating n-dimensional arrays, functioning as both a parallel array processor and a toolkit for numerical data analysis. The library distinguishes itself by providing efficient slicing and memory views, allowing for data sharing without copying. It leverages optimized backend math libraries for high-speed matrix multiplication and distributes heavy mathematical iterations across multiple CPU threads to accelerate
This repository is a comprehensive collection of instructional guides and practical examples for Python development, focusing on machine learning, data science, and web scraping. It provides implementations for neural networks, reinforcement learning algorithms, and deep learning architectures using PyTorch, alongside detailed manuals for scientific computing and data visualization. The project distinguishes itself by offering specialized tutorials on concurrent programming to optimize CPU performance and guides for setting up Linux development environments. It covers the implementation of ad
This project is a machine learning educational curriculum and learning platform delivered through interactive Jupyter Notebooks. It serves as a comprehensive guide for mastering the Python data science toolkit, providing structured tutorials for numerical computing, tabular data manipulation, and statistical visualization. The curriculum includes specific implementation guides for Scikit-Learn and a practical course on TensorFlow for constructing, training, and deploying neural networks and computer vision models. It covers the end-to-end process of building predictive models, from initial pr
Surge is a Swift library for high-performance numerical analysis, linear algebra, digital signal processing, and accelerated image manipulation. It utilizes the Accelerate framework to provide hardware-accelerated tools for matrix mathematics and signal processing. The library provides specialized capabilities for digital signal processing, including convolution, signal similarity analysis through cross-correlation, and domain transformations using fast Fourier transforms. It also includes a suite of tools for the rapid transformation and analysis of pixel buffers and image data. Beyond sign
This is an interactive notebook-based course that teaches machine learning from Python fundamentals through deep learning and natural language processing. It uses real datasets and multiple frameworks within a structured, hands-on curriculum that combines concise explanations with executable code cells, built-in datasets, and embedded exercise checkpoints. Learning progresses through data preparation and exploration, classical machine learning workflows, computer vision with convolutional neural networks, and natural language processing with deep learning, all delivered as a cohesive progressi
This project is a numerical computing library designed for scientific and engineering mathematical operations. It functions as a comprehensive linear algebra framework, a statistical analysis library, and a toolkit for mathematical optimization and numerical integration. The library is distinguished by its provider-based native acceleration, which allows managed code to be swapped for platform-native binary libraries to increase the performance of computationally intensive routines. It also supports a hybrid approach to matrix storage, implementing separate strategies for dense and sparse mat
jetson-inference is a set of libraries and tools for executing optimized deep learning models on embedded GPU hardware. Its primary purpose is to enable real-time computer vision and AI inference at the edge with low latency and high throughput. The project distinguishes itself through high-performance streaming analytics and the ability to execute concurrent AI pipelines on auto-grade silicon. It provides specialized support for multi-sensor stream processing, utilizing zero-copy data transport to load camera frames directly into GPU memory. The codebase covers a broad surface of capabiliti
NumCpp is a C++ framework and numerical computing library that provides a toolkit for multi-dimensional array management and mathematical routines. It functions as a C++ implementation of the NumPy ecosystem, offering a scientific computing framework for managing tensors and solving complex algebraic equations. The project enables high-performance array manipulation within a C++ environment without relying on a Python runtime. It distinguishes itself by providing a NumPy-like interface for executing linear algebra, managing multi-dimensional data structures, and performing numerical proces
Codon is an LLVM-based Python compiler and statically typed implementation that translates source code into optimized machine instructions. It functions as a high-performance numerical backend and a GPU computing framework designed to remove runtime overhead. The project implements a compiled alternative to NumPy, translating array logic directly into machine code. It differentiates itself by generating specialized hardware kernels for graphics processors and utilizing static type inference to enable aggressive machine-code optimization. The system provides capabilities for parallel workload
This project is a structured learning curriculum and technical reference for mastering deep learning with TensorFlow. It provides a comprehensive guide for building, training, and deploying neural networks, combining theoretical fundamentals with practical implementation examples. The repository distinguishes itself by covering the end-to-end machine learning workflow, from low-level tensor mathematics and linear algebra to the creation of complex model architectures. It includes specific guidance on developing data pipelines for diverse data types, such as images, text, and time-series seque
This project is a machine learning array framework and tensor computation library designed for high-performance numerical computing. It provides a comprehensive suite of tools for constructing and training neural networks, featuring an automatic differentiation engine that facilitates gradient-based optimization and complex mathematical modeling. The library distinguishes itself through a unified memory architecture that allows data to be shared across CPU and GPU devices without explicit copies, significantly reducing data movement overhead. Its execution model relies on a lazy evaluation en
MNN is a high-performance inference engine and framework designed for on-device machine learning. It provides a comprehensive environment for executing, optimizing, and deploying neural network models directly on mobile and resource-constrained edge devices. The framework distinguishes itself through a robust model optimization toolkit that supports quantization, compression, and structural graph manipulation to minimize memory footprint and maximize execution speed. It features a modular architecture that abstracts hardware-specific backends, allowing models to run efficiently across diverse
This repository serves as a comprehensive collection of reference implementations for the PyTorch machine learning library. It provides practical examples for building, training, and deploying deep learning models, functioning as a toolkit for developers to explore neural network architectures and training workflows. The project distinguishes itself by offering concrete demonstrations of complex machine learning operations, ranging from computer vision tasks like object detection and depth estimation to the training of large-scale transformer models. These examples illustrate how to implement
Dask is a parallel computing framework and distributed task scheduler designed to scale Python data science workflows from single machines to large clusters. It functions as a cluster resource manager that orchestrates computational logic by representing tasks and their dependencies as directed acyclic graphs. This architecture allows the system to automate the distribution of workloads across available hardware while managing complex execution requirements. The project distinguishes itself through a lazy evaluation engine that defers data operations until they are explicitly requested, enabl
Leaf is a machine learning framework and neural network architecture toolkit used for building, training, and deploying models. It functions as a hardware abstraction layer, mapping high-level computational graphs to low-level instructions across various CPU and GPU backends and operating systems. The system enables the design of flexible model structures through a modular architecture where reusable container layers encapsulate weights and mathematical operations. This allows for the composition of complex neural networks via nested components. The framework includes a data engineering pipe
OpenBLAS is a high-performance implementation of the Basic Linear Algebra Subprograms standard designed for numerical computing and matrix operations. It serves as a hardware-accelerated numerical library and optimized math kernel library, providing a computational engine for large-scale matrix multiplication and vector operations. The library distinguishes itself through the use of hand-tuned assembly kernels and SIMD instruction mapping, such as AVX and SVE, to maximize floating-point performance on specific CPU architectures. It features a multi-threaded framework that manages parallel exe
DeepGEMM is a suite of specialized GPU kernels and a just-in-time compiler designed for low-precision matrix operations, Mixture-of-Experts models, and attention processing. It provides a library of high-performance matrix multiplication kernels using FP8 precision to increase compute throughput and reduce memory usage. The project features a JIT CUDA kernel compiler that generates and loads optimized compute kernels at runtime to eliminate the need for manual compilation during installation. It includes specialized implementations for grouped matrix multiplication that process multiple group
CuPy is a CUDA array computing library that implements a NumPy-compatible interface for executing array operations and numerical computing on NVIDIA GPUs. It serves as a GPU-accelerated numerical library and a CUDA-based SciPy implementation, offloading heavy calculations to graphics hardware to increase processing speed for scientific and engineering workloads. The library enables multi-framework tensor exchange, allowing data buffers to be shared between different deep learning frameworks using standardized memory layouts to avoid memory copies. It also supports custom GPU kernel integratio
ThinkDSP is a Python-based audio signal processing framework and educational resource designed for studying the mathematical properties of digital audio and waveforms. It functions as a digital signal processing library that provides tools for performing frequency analysis and harmonic decomposition of sound waves. The project covers the fundamentals of audio frequency analysis and sound synthesis, enabling the decomposition of sound into harmonics to analyze or modify spectral content. It facilitates Python audio programming by providing the means to manipulate audio files and generate synth
This project is a comprehensive instructional resource and course for building neural networks using PyTorch. It covers the fundamental building blocks of deep learning, including tensor manipulation, automatic differentiation, and the construction of modular neural network components. The repository serves as a technical guide for several specialized domains. It provides implementation details for computer vision tasks such as image classification, object detection, and semantic segmentation, as well as natural language processing workflows involving transformers, recurrent networks, and gen
Flashlight is a C++ machine learning library and deep learning framework designed for building and training neural networks. It functions as a tensor manipulation library and an automatic differentiation engine that tracks operations to calculate gradients via backpropagation for model optimization. The project is distinguished by its role as a distributed training framework, utilizing all-reduce gradient synchronization and distributed environments to scale machine learning workloads across multiple nodes and devices. It features a backend-agnostic memory interface and RAII-based management
This project is a scientific computing framework for the .NET ecosystem, providing a comprehensive suite of libraries for numerical analysis, statistics, and mathematical optimization. It serves as a foundational toolkit for developing applications in machine learning, digital signal processing, and computer vision. The framework provides specialized toolkits for training and deploying predictive models, including neural networks, support vector machines, and decision trees. It further distinguishes itself with deep integrations for real-time visual analysis, such as object tracking and facia
Flux.jl is a deep learning framework and numerical computing toolkit written in Julia. It serves as a machine learning library for designing and training neural networks, providing a system for automatic differentiation to optimize model parameters. The framework enables deep learning development and machine learning research by representing layers as parameterized functions. It supports scientific machine learning, integrating neural networks into workflows for solving physical and mathematical problems. The toolkit provides native GPU acceleration for tensor computations and utilizes rever
This project is a collection of foundational machine learning algorithms and data science tools implemented in Python. It focuses on building the logic of these tools using basic programming primitives rather than relying on specialized libraries. The implementation covers several core domains, including a linear algebra library for matrix and vector operations, a statistical analysis toolkit for probability and hypothesis testing, and a framework for map-reduce distributed processing. It also includes implementations for natural language processing, graph theory for network analysis, and var
ccv is a computer vision library written in C designed for high-performance visual analysis. It serves as a framework for image classification, object detection, and the identification of faces, pedestrians, and vehicles. The library distinguishes itself through hardware-accelerated vision and deep learning inference optimizations. It utilizes a quantized tensor processor to transform floating-point data into eight-bit integers and implements integer-quantized attention mechanisms to reduce memory bandwidth and increase data throughput. The project covers a broad range of capabilities, inclu
PRML is a Python machine learning library and statistical learning toolkit. It provides code implementations of supervised and unsupervised learning concepts, including regression, classification, and neural network algorithms for statistical data modeling. The project functions as a pattern recognition toolkit used to identify theoretical structures within numerical datasets. It includes a neural network framework for solving nonlinear data mappings and a linear algebra toolkit that utilizes vectorized operations and matrix calculations. The library covers a broad range of capabilities, inc