30 open-source projects similar to numpy/numpy, ranked by how many features they share with it. Compare stars, activity, and what each one does to find the best NumPy alternative.
SciPy is a scientific computing library for Python that provides a comprehensive collection of mathematical algorithms and numerical tools for research and engineering. It functions as a high-performance numerical analysis framework, bridging high-level Python code with compiled C and Fortran routines to execute complex computations at near-native speed. The library is built upon array-based data structures that use strided memory layouts to enable efficient data manipulation and slicing. By employing vectorized operation dispatch and linking to optimized hardware-specific linear algebra li
xtensor is a C++ multidimensional array library for numerical computing that provides N-dimensional containers with an interface mirroring the NumPy API. It utilizes a lazy evaluation expression engine to defer numerical computations until assignment, which minimizes memory allocations and intermediate copies. The library features a foreign memory array adaptor that allows it to wrap external buffers, such as NumPy arrays, to perform numerical operations in-place without duplicating data. It further optimizes performance through lazy broadcasting and a system that manages the lifetime of temp
ndarray is a multidimensional array library for Rust that serves as a linear algebra framework and scientific computing tool. It provides the core infrastructure for creating and manipulating n-dimensional arrays, functioning as both a parallel array processor and a toolkit for numerical data analysis. The library distinguishes itself by providing efficient slicing and memory views, allowing for data sharing without copying. It leverages optimized backend math libraries for high-speed matrix multiplication and distributes heavy mathematical iterations across multiple CPU threads to accelerate
CuPy is a CUDA array computing library that implements a NumPy-compatible interface for executing array operations and numerical computing on NVIDIA GPUs. It serves as a GPU-accelerated numerical library and a CUDA-based SciPy implementation, offloading heavy calculations to graphics hardware to increase processing speed for scientific and engineering workloads. The library enables multi-framework tensor exchange, allowing data buffers to be shared between different deep learning frameworks using standardized memory layouts to avoid memory copies. It also supports custom GPU kernel integratio
Pandas is a high-performance data analysis library that provides a comprehensive framework for manipulating, cleaning, and transforming structured datasets. It centers on labeled one-dimensional and two-dimensional data structures, allowing users to construct, filter, and reshape tabular information while performing complex arithmetic and logical operations. The library distinguishes itself through a sophisticated indexing engine that enables automatic data alignment during calculations and relational merges. By utilizing a block-based memory layout, it optimizes cache locality for vectorized
PyMC is a Bayesian probabilistic programming framework used for building probabilistic models and performing Bayesian inference. It provides a probabilistic graphical model library for specifying random variables, priors, and likelihood functions, supported by an MCMC sampling engine and variational inference tools to estimate posterior distributions. The framework features a GPU-accelerated inference backend that compiles models into machine code to increase execution speed. It utilizes a backend-agnostic tensor execution model and just-in-time graph compilation to optimize the computation o
Dask is a parallel computing framework and distributed task scheduler designed to scale Python data science workflows from single machines to large clusters. It functions as a cluster resource manager that orchestrates computational logic by representing tasks and their dependencies as directed acyclic graphs. This architecture allows the system to automate the distribution of workloads across available hardware while managing complex execution requirements. The project distinguishes itself through a lazy evaluation engine that defers data operations until they are explicitly requested, enabl
This project is a comprehensive library for numerical linear algebra and scientific computing, designed to provide optimized routines for matrix decomposition, statistical modeling, and high-performance data analysis. It serves as both a toolkit for solving complex linear systems and an educational resource for understanding the fundamental algorithms behind matrix factorizations and numerical solvers. The library distinguishes itself through a focus on randomized numerical linear algebra, utilizing probabilistic algorithms and approximate methods to perform dimensionality reduction and matri
Torch7 is a scientific computing environment and tensor computation library used for deep learning research and numerical analysis. It functions as a Lua-based framework for training neural networks and learning agents, providing a toolkit for implementing architectures and training through reinforcement learning algorithms. The project is distinguished by its tight integration with C, utilizing a binding layer to map high-level scripting to low-level C structures for direct memory access. It supports hardware-accelerated computation by offloading linear algebra and convolution operations to
NumCpp is a C++ framework and numerical computing library that provides a toolkit for multi-dimensional array management and mathematical routines. It functions as a C++ implementation of the NumPy ecosystem, offering a scientific computing framework for managing tensors and solving complex algebraic equations. The project enables high-performance array manipulation within a C++ environment without relying on a Python runtime. It distinguishes itself by providing a NumPy-like interface for executing linear algebra, managing multi-dimensional data structures, and performing numerical proces
This project is a structured learning curriculum and technical reference for mastering deep learning with TensorFlow. It provides a comprehensive guide for building, training, and deploying neural networks, combining theoretical fundamentals with practical implementation examples. The repository distinguishes itself by covering the end-to-end machine learning workflow, from low-level tensor mathematics and linear algebra to the creation of complex model architectures. It includes specific guidance on developing data pipelines for diverse data types, such as images, text, and time-series seque
This project is a curated collection of programming exercises designed to build proficiency in numerical computing and data manipulation. It provides a structured learning path for mastering multidimensional array operations, vectorized arithmetic, and statistical analysis. The repository focuses on developing practical expertise in array-based workflows, emphasizing techniques such as memory management, efficient data processing, and the replacement of explicit loops with vectorized operations. Users engage with hands-on challenges that cover the full lifecycle of numerical data, from initia
Vaex is a high-performance Apache Arrow DataFrame library and out-of-core data processing engine designed to handle billion-row tabular datasets in Python. It functions as a lazy evaluation framework that defers computations and transformations until results are required, enabling the processing of datasets that exceed available system RAM by mapping files directly from disk. The project distinguishes itself as a tool for big data visualization and exploration, specifically integrated for use within interactive notebooks. It provides specialized capabilities for machine learning feature engin
Numba is a just-in-time compiler that translates high-level Python functions into optimized machine code at runtime. By leveraging the LLVM compiler infrastructure, it provides a framework for accelerating numerical data processing and mathematical computations, enabling performance levels comparable to statically compiled languages. The project distinguishes itself through its ability to perform type-inference-based specialization, which generates machine instructions tailored to the specific data types used during execution. It employs a lazy compilation pipeline that defers translation unt
TensorFlow is a comprehensive machine learning framework designed for the construction, training, and deployment of complex mathematical models. It utilizes a graph-based execution model that represents operations as directed acyclic graphs, enabling automatic differentiation and efficient parallel processing. The system provides high-level interfaces for defining neural network architectures, alongside a robust engine for managing multidimensional array structures and tensor mathematics. The framework distinguishes itself through a scalable distributed runtime that orchestrates workloads acr
OpenBLAS is a high-performance implementation of the Basic Linear Algebra Subprograms standard designed for numerical computing and matrix operations. It serves as a hardware-accelerated numerical library and optimized math kernel library, providing a computational engine for large-scale matrix multiplication and vector operations. The library distinguishes itself through the use of hand-tuned assembly kernels and SIMD instruction mapping, such as AVX and SVE, to maximize floating-point performance on specific CPU architectures. It features a multi-threaded framework that manages parallel exe
OpenBLAS is a high-performance library for basic linear algebra subprograms that provides optimized matrix and vector operations. It serves as a multi-architecture math backend and numerical computing framework designed to execute complex mathematical calculations and high-speed numerical analysis. The library functions as an optimized CPU math library that detects hardware at runtime to apply the most efficient operation kernels for the specific processor. It supports multiple CPU targets through a combination of optimized assembly and C implementations. The project covers high-performance
Math.js is a comprehensive JavaScript library for scientific, complex, and arbitrary precision calculations. It functions as a symbolic computation engine, a linear algebra toolkit, a statistical analysis library, and a unit conversion system. The project distinguishes itself by providing a symbolic engine capable of parsing, simplifying, and manipulating mathematical expressions algebraically without requiring immediate numerical evaluation. It includes a framework for defining and converting physical quantities with units of measure and automatic prefix support. The library covers a broad
Smile is a comprehensive JVM machine learning library and statistical computing toolkit. It provides a suite of algorithms for classification, regression, and clustering, implemented natively for Java, Scala, and Kotlin. The project also functions as a deep learning framework, a natural language processing library, and an inference engine for large language models. The library distinguishes itself through GPU acceleration via LibTorch bindings and support for the ONNX model interchange format. It includes specialized capabilities for large language model inference, featuring Byte-Pair Encodin
This project is a community-driven standard library for the Fortran programming language, providing a comprehensive collection of algorithms, data structures, and system utilities. It is designed to extend the language's native capabilities, offering a unified toolkit for scientific computing, numerical analysis, and general-purpose programming. The library distinguishes itself through a modular architecture that utilizes generic interface dispatch and compile-time specialization to ensure high performance across various data types. It provides standardized abstractions for external numerical
This project is a collection of foundational machine learning algorithms and data science tools implemented in Python. It focuses on building the logic of these tools using basic programming primitives rather than relying on specialized libraries. The implementation covers several core domains, including a linear algebra library for matrix and vector operations, a statistical analysis toolkit for probability and hypothesis testing, and a framework for map-reduce distributed processing. It also includes implementations for natural language processing, graph theory for network analysis, and var
This project is a collection of interactive Python notebooks and educational resources designed for mastering data science, machine learning, and numerical computing. It provides a series of practical guides and tutorials covering deep learning, big data processing, and statistical analysis. The repository features specialized instructional suites for implementing classical machine learning algorithms, building deep learning model architectures, and managing AWS cloud infrastructure. It includes dedicated notebooks for data visualization and numerical computing exercises. The project covers
PyTorch is a machine learning framework centered on a GPU-ready tensor library that supports multi-dimensional array operations across both CPU and accelerator hardware. It provides a foundational infrastructure for mathematical computation and dynamic neural network construction, utilizing a tape-based automatic differentiation system that allows for flexible, non-static graph execution. The framework is designed for deep integration with Python, enabling natural usage alongside standard scientific computing ecosystems. It distinguishes itself through a comprehensive distributed training sui
Polars is a high-performance columnar data processing library designed for efficient analytical workflows. It functions as a structured data library that organizes information into typed columns, utilizing the Apache Arrow memory format to enable zero-copy data sharing and cache-friendly, vectorized operations. The engine is built to handle large-scale tabular datasets, providing both local and distributed analytical runtimes that scale from single-machine environments to multi-node clusters. The project distinguishes itself through a sophisticated lazy query engine that constructs abstract e
Scikit-learn is a machine learning library for predictive data analysis that provides a collection of algorithms for supervised and unsupervised learning. It functions as a comprehensive toolkit for data preprocessing, dimensionality reduction, and model selection, allowing users to classify data objects, predict continuous values, and cluster similar items based on historical patterns. The project is defined by a unified interface design where objects either learn from data, transform data, or chain these operations into sequential workflows. To ensure performance on large or high-dimensiona
NetworkX is a Python library designed for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks. It provides a comprehensive framework for modeling relationships between entities as graphs, directed graphs, or multigraphs, allowing users to attach arbitrary metadata and properties to nodes and edges. The library distinguishes itself through a modular architecture that decouples graph analysis logic from data storage, utilizing nested dictionaries and adjacency lists to manage topology. It features a pluggable backend system that delegates computat
CVXPY is a Python-embedded domain-specific language for modeling and solving convex optimization problems using natural mathematical syntax. It is built on a disciplined convex programming framework that automatically enforces convexity rules, ensuring that problems formulated by the user are valid for convex solvers. The project also functions as a multi-solver optimization interface, abstracting away backend details and dispatching problems to specialized solvers like ECOS, SCS, and Gurobi without manual configuration. Beyond standard convex optimization, CVXPY extends its reach to geometri
SymPy is a Python computer algebra system and symbolic mathematics library. It performs algebraic manipulations, calculus, and equation solving using symbolic representations to achieve exact computations rather than numerical approximations. The library includes a LaTeX expression parser that converts mathematical strings into symbolic representations for computation and formula manipulation. It also incorporates a mathematical benchmarking suite to measure execution speed and detect performance regressions across different software versions. The system provides capabilities for automated m
Keras is a high-level deep learning framework designed for constructing and training neural networks through the composition of modular, functional layers. It serves as a comprehensive modeling toolkit that provides standardized procedures for defining, evaluating, and deploying complex architectures. By utilizing a directed acyclic graph approach, the framework allows users to build intricate models with multiple inputs, outputs, and shared layers, ensuring consistent numerical execution through functional state management. The project distinguishes itself as a multi-backend machine learning
Modin is a distributed dataframe library and parallel data processing engine designed to handle large datasets that exceed system memory. It functions as a distributed computing framework that parallelizes data manipulation tasks across multiple CPU cores or clusters to increase throughput and avoid memory errors. The project mirrors the Pandas API, allowing for the distribution of data workflows without changing core code logic. It utilizes a pluggable backend interface, which enables users to switch between different distributed execution engines to optimize performance based on available h
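Several of the descriptions above mention strided memory layouts and views that share data without copying (the SciPy and ndarray entries, for example). As a minimal sketch of what that means in practice, assuming NumPy is installed, the following shows a slice reusing the original buffer and changing only the strides. The array shape and values are illustrative and not taken from any listed project.

```python
import numpy as np

# A 3x4 array of 64-bit integers stored in row-major (C) order.
a = np.arange(12, dtype=np.int64).reshape(3, 4)

# Strides are the byte steps needed to advance one element along each axis.
print(a.strides)                    # (32, 8)

# Taking every other column yields a view: same buffer, different strides.
view = a[:, ::2]
print(view.strides)                 # (32, 16)
print(np.shares_memory(a, view))    # True

# Writing through the view changes the original array, since no copy was made.
view[0, 0] = 100
print(a[0, 0])                      # 100
```

Libraries that advertise a NumPy-like interface generally aim to offer comparable slicing semantics, but whether a given operation returns a view or a copy depends on the library and should be checked in its own documentation.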