# ml-explore/mlx

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/ml-explore-mlx).**

23,986 stars · 1,519 forks · C++ · MIT

## Links

- GitHub: https://github.com/ml-explore/mlx
- Homepage: https://ml-explore.github.io/mlx/
- awesome-repositories: https://awesome-repositories.com/repository/ml-explore-mlx.md

## Topics

`mlx`

## Description

This project is a machine learning array framework and tensor computation library designed for high-performance numerical computing. It provides a comprehensive suite of tools for constructing and training neural networks, featuring an automatic differentiation engine that facilitates gradient-based optimization and complex mathematical modeling.
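
To make the idea of an automatic differentiation engine concrete, here is a minimal pure-Python sketch of reverse-mode differentiation on scalars. It is a toy illustration under simplifying assumptions, not MLX's implementation or API; the names `Var` and `backward` are hypothetical.

```python
class Var:
    """Scalar that records how it was computed so gradients can flow back."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # pairs of (parent Var, local derivative)
        self.grad = 0.0

    def __add__(self, other):
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))


def backward(output):
    """Accumulate d(output)/d(node) into the .grad of every node."""
    order, seen = [], set()

    def visit(node):
        # Post-order walk: a node is appended after all of its parents
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent, _ in node.parents:
            visit(parent)
        order.append(node)

    visit(output)
    output.grad = 1.0
    # Reverse order guarantees a node is processed after all of its consumers
    for node in reversed(order):
        for parent, local in node.parents:
            parent.grad += node.grad * local


x, w, b = Var(3.0), Var(2.0), Var(1.0)
loss = (w * x + b) * (w * x + b)  # (w*x + b)^2
backward(loss)
```

After `backward(loss)`, each input's `.grad` holds the derivative of the loss with respect to that input, which is the quantity a gradient-based optimizer consumes.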

The library distinguishes itself through a unified memory architecture that allows data to be shared across CPU and GPU devices without explicit copies, significantly reducing data movement overhead. Its execution model relies on a lazy evaluation engine and graph-based operation recording, which enables kernel fusion compilation to merge multiple operations into optimized execution units. These capabilities are complemented by stream-based execution control, which manages hardware-level concurrency to maximize throughput during intensive tensor processing.
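
The lazy evaluation model described above can be sketched in a few lines of pure Python. This is a toy illustration only: the `Lazy` class is hypothetical and does not reflect MLX internals. MLX's documentation describes forcing evaluation with calls such as `mx.eval`; the sketch uses its own `eval` method instead.

```python
class Lazy:
    """Node in a recorded computation graph; nothing runs until eval()."""

    evaluations = 0  # counts how many nodes were actually computed

    def __init__(self, fn, *inputs):
        self.fn = fn
        self.inputs = inputs
        self._value = None
        self._done = False

    @staticmethod
    def constant(value):
        return Lazy(lambda: value)

    def __add__(self, other):
        return Lazy(lambda a, b: a + b, self, other)

    def __mul__(self, other):
        return Lazy(lambda a, b: a * b, self, other)

    def eval(self):
        # Compute inputs first, then this node; cache so work is done once
        if not self._done:
            args = [node.eval() for node in self.inputs]
            self._value = self.fn(*args)
            self._done = True
            Lazy.evaluations += 1
        return self._value


a = Lazy.constant(2.0)
b = Lazy.constant(5.0)
c = a * b + a                       # only records the graph
unused = c * c                      # recorded but never evaluated
recorded_only = Lazy.evaluations    # still zero at this point
result = c.eval()                   # work happens here
```

Because the whole graph is known before anything runs, a real engine has the opportunity to skip unused branches (as with `unused` here) and to fuse adjacent operations.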

Beyond its core execution model, the framework supports a broad range of capabilities including distributed sharding infrastructure for scaling workloads across multiple devices, and extensive utilities for model weight management and serialization. It provides a deep library of mathematical and statistical operations, alongside specialized functions for quantized matrix multiplication and autoregressive text generation.
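
As a rough illustration of autoregressive generation, the sketch below yields tokens one at a time from a generator function, feeding each output back in as input. The `toy_model` function is a hypothetical stand-in for a language model, and none of these names come from MLX.

```python
def generate(prompt_tokens, next_token_fn, max_tokens):
    """Yield tokens one at a time, appending each to the running context."""
    tokens = list(prompt_tokens)
    for _ in range(max_tokens):
        token = next_token_fn(tokens)
        tokens.append(token)
        yield token


def toy_model(tokens):
    # Hypothetical stand-in for a model: next token is the sum of the
    # last two tokens modulo 10
    return (tokens[-1] + tokens[-2]) % 10


out = list(generate([1, 1], toy_model, max_tokens=5))
```

Using a generator lets the caller stream tokens as they are produced and stop early without changing the generation loop.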

The project is implemented in C++ and includes build-time configuration options to tailor hardware backends and compilation settings for specific deployment environments.

## Tags

### Artificial Intelligence & ML

- [Tensor Computing Libraries](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/model-optimization-and-inference/hardware-and-acceleration/tensor-computing-libraries.md) — Provides a high-performance toolkit for tensor manipulation and hardware-accelerated mathematical operations across CPU and GPU devices.
- [Deep Learning Libraries](https://awesome-repositories.com/f/artificial-intelligence-ml/deep-learning-libraries.md) — Provides a framework for constructing and training neural networks with custom modules.
- [Gradient Computation](https://awesome-repositories.com/f/artificial-intelligence-ml/gradient-computation.md) — Computes derivatives of functions with respect to specific inputs or nested data structures, facilitating gradient-based optimization. ([source](https://ml-explore.github.io/mlx/build/html/usage/function_transforms.html))
- [Machine Learning Frameworks](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning-frameworks.md) — Provides multi-dimensional arrays and automatic differentiation for efficient machine learning.
- [Automatic Differentiation Systems](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/frameworks/model-construction/automatic-differentiation-systems.md) — Provides an automatic differentiation engine for computing gradients in neural network training and complex mathematical modeling.
- [Neural Network Modules](https://awesome-repositories.com/f/artificial-intelligence-ml/neural-network-modules.md) — Provides a base class for defining neural network modules with parameter registration and forward logic. ([source](https://ml-explore.github.io/mlx/build/html/examples/mlp.html))
- [Numerical Computing Libraries](https://awesome-repositories.com/f/artificial-intelligence-ml/numerical-computing-libraries.md) — Provides high-performance linear algebra and multidimensional array operations for numerical computing.
- [Hardware Acceleration](https://awesome-repositories.com/f/artificial-intelligence-ml/hardware-acceleration.md) — Directs computational tasks to specific hardware streams to maximize tensor processing throughput.
- [Distributed Learning](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/model-training-and-tuning/distributed-and-scaling-strategies/distributed-learning.md) — Scales training and inference workloads across multiple compute nodes by synchronizing gradients.
- [Neural Networks](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/machine-learning-concepts/network-architectures-and-layers/neural-networks.md) — Provides tools for building and training custom neural network architectures.
- [Model Training Optimizers](https://awesome-repositories.com/f/artificial-intelligence-ml/model-training-optimizers.md) — Updates model weights using standard optimization algorithms based on computed gradients to minimize loss during the training process. ([source](https://ml-explore.github.io/mlx/build/html/examples/mlp.html))
- [Distributed Training Sharding](https://awesome-repositories.com/f/artificial-intelligence-ml/distributed-training-sharding.md) — Implements distributed sharding infrastructure to partition model parameters across multiple devices for parallel processing.
- [Distributed Gradient Synchronization](https://awesome-repositories.com/f/artificial-intelligence-ml/gradient-computation/distributed-gradient-synchronization.md) — Coordinates gradient updates across multiple compute nodes to enable parallel processing of large datasets across distributed hardware resources. ([source](https://ml-explore.github.io/mlx/build/html/examples/data_parallelism.html))
- [Kernel Fusion Compilers](https://awesome-repositories.com/f/artificial-intelligence-ml/kernel-fusion-compilers.md) — Merges multiple operations into single optimized execution units to reduce memory bandwidth and improve processing speed.
- [Machine Learning Model Portability](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/model-management/machine-learning-model-portability.md) — Optimizes and prepares pre-trained model weights and graphs for efficient production deployment.
- [Model Serializers](https://awesome-repositories.com/f/artificial-intelligence-ml/model-serializers.md) — Provides utilities for saving model and optimizer states to persistent storage. ([source](https://ml-explore.github.io/mlx/build/html/usage/export.html))
- [Convolutional Operations](https://awesome-repositories.com/f/artificial-intelligence-ml/convolutional-operations.md) — Computes discrete convolutions across multiple dimensions with support for custom strides, padding, and dilations. ([source](https://ml-explore.github.io/mlx/build/html/python/_autosummary/mlx.core.convolve.html))
- [Lazy Evaluation Engines](https://awesome-repositories.com/f/artificial-intelligence-ml/lazy-evaluation-engines.md) — Defers the execution of array operations until results are explicitly requested to optimize memory usage and scheduling.
- [Graph Serialization Formats](https://awesome-repositories.com/f/artificial-intelligence-ml/model-serialization-formats/graph-serialization-formats.md) — Serializes and saves optimized function graphs to disk for production deployment. ([source](https://ml-explore.github.io/mlx/build/html/index.html))
- [Masked Multiplication Operators](https://awesome-repositories.com/f/artificial-intelligence-ml/sparse-softmax-kernels/masked-softmax-operators/masked-multiplication-operators.md) — The library multiplies two arrays while applying block-level masks to the input or output matrices to selectively ignore specific segments. ([source](https://ml-explore.github.io/mlx/build/html/python/_autosummary/mlx.core.block_masked_mm.html))
- [Tensor Parallelism](https://awesome-repositories.com/f/artificial-intelligence-ml/tensor-parallelism.md) — Converts linear layers into parallelized versions that shard weights across multiple devices. ([source](https://ml-explore.github.io/mlx/build/html/examples/tensor_parallelism.html))
- [Text Generation](https://awesome-repositories.com/f/artificial-intelligence-ml/text-generation.md) — Processes input prompts and produces tokens autoregressively using a generator function. ([source](https://ml-explore.github.io/mlx/build/html/examples/llama-inference.html))
- [Data Type Converters](https://awesome-repositories.com/f/artificial-intelligence-ml/data-type-converters.md) — Provides utilities for inspecting and converting tensor data types to manage precision and memory. ([source](https://ml-explore.github.io/mlx/build/html/python/_autosummary/mlx.core.DtypeCategory.html))
- [Einstein Summation Utilities](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/model-optimization-and-inference/hardware-and-acceleration/tensor-computing-libraries/tensor-operations/einstein-summation-utilities.md) — Computes multidimensional array operations using the Einstein summation convention for complex tensor contractions. ([source](https://ml-explore.github.io/mlx/build/html/python/_autosummary/mlx.core.einsum.html))
- [Model Weight Converters](https://awesome-repositories.com/f/artificial-intelligence-ml/model-weight-converters.md) — Transforms pre-trained model parameters from common file formats into structured formats for inference. ([source](https://ml-explore.github.io/mlx/build/html/examples/llama-inference.html))
- [Gathered Matrix Multiplication](https://awesome-repositories.com/f/artificial-intelligence-ml/matrix-operation-fusions/gathered-matrix-multiplication.md) — Provides optimized operations that combine index-based data selection with matrix multiplication for improved performance. ([source](https://ml-explore.github.io/mlx/build/html/python/_autosummary/mlx.core.gather_mm.html))
- [Statistical Analysis](https://awesome-repositories.com/f/artificial-intelligence-ml/statistical-analysis.md) — Computes standard deviations and variances of array elements along specified axes. ([source](https://ml-explore.github.io/mlx/build/html/python/_autosummary/mlx.core.array.std.html))
- [Dimension Squeezing](https://awesome-repositories.com/f/artificial-intelligence-ml/tensor-reductions/dimension-preservers/dimension-squeezing.md) — The library eliminates dimensions of size one from an array to simplify its shape while preserving the underlying data. ([source](https://ml-explore.github.io/mlx/build/html/python/_autosummary/mlx.core.array.squeeze.html))
- [Weight Initialization](https://awesome-repositories.com/f/artificial-intelligence-ml/weight-initialization.md) — Imports serialized weight files into model architectures to prepare layers for inference. ([source](https://ml-explore.github.io/mlx/build/html/examples/llama-inference.html))
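
The tensor parallelism and sharding entries above rest on a simple identity: a linear layer's weight matrix can be split by output columns, each shard multiplied independently, and the partial results concatenated. The following pure-Python sketch shows that identity with nested lists; the helper names are hypothetical, the "devices" are simulated by list entries, and it assumes the column count divides evenly by the number of shards.

```python
def matmul(x, w):
    """Multiply an (n x k) matrix by a (k x m) matrix, both nested lists."""
    return [[sum(row[p] * w[p][j] for p in range(len(w)))
             for j in range(len(w[0]))]
            for row in x]


def shard_columns(w, num_shards):
    """Split the output columns of w into equal contiguous shards."""
    step = len(w[0]) // num_shards  # assumes columns divide evenly
    return [[row[s * step:(s + 1) * step] for row in w]
            for s in range(num_shards)]


x = [[1.0, 2.0]]
w = [[1.0, 2.0, 3.0, 4.0],
     [5.0, 6.0, 7.0, 8.0]]

shards = shard_columns(w, num_shards=2)            # one shard per "device"
partials = [matmul(x, shard) for shard in shards]  # computed independently
# Gather: concatenate the per-device output slices along the column axis
gathered = [[value for part in partials for value in part[i]]
            for i in range(len(x))]
```

Each simulated device only ever holds its own slice of the weights, which is where the memory saving of sharding comes from.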

### Scientific & Mathematical Computing

- [Graph-Based Computational Execution](https://awesome-repositories.com/f/scientific-mathematical-computing/data-modeling-processing/computational-graphs/graph-based-computational-execution.md) — Captures sequences of mathematical operations as a graph to enable automatic differentiation and kernel fusion.
- [Unified Memory Systems](https://awesome-repositories.com/f/scientific-mathematical-computing/unified-memory-systems.md) — Shares array data across CPU and GPU devices without explicit copies to minimize data movement overhead.
- [Matrix Operations](https://awesome-repositories.com/f/scientific-mathematical-computing/high-performance-execution-environments/scientific-computing-platforms/scientific-computing/matrix-operations.md) — Performs essential linear algebra operations including matrix multiplication and scaling for neural network layers. ([source](https://ml-explore.github.io/mlx/build/html/python/ops.html))
- [Quantized Matrix Multiplication](https://awesome-repositories.com/f/scientific-mathematical-computing/high-performance-execution-environments/scientific-computing-platforms/scientific-computing/matrix-operations/matrix-vector-products/quantized-matrix-multiplication.md) — Enables efficient inference by performing matrix multiplication using quantized weights and scale parameters. ([source](https://ml-explore.github.io/mlx/build/html/python/_autosummary/mlx.core.gather_qmm.html))
- [Vectorized Array Operations](https://awesome-repositories.com/f/scientific-mathematical-computing/high-performance-execution-environments/scientific-computing-platforms/scientific-computing/vectorized-array-operations.md) — Transforms functions to operate over batches of data automatically by mapping operations across specified array dimensions to improve execution performance. ([source](https://ml-explore.github.io/mlx/build/html/usage/function_transforms.html))
- [Mathematical Function Transformation](https://awesome-repositories.com/f/scientific-mathematical-computing/numerical-mathematical-foundations/mathematical-libraries-and-utilities/mathematical-function-transformation.md) — Applies automatic differentiation and vectorization to functions to compose gradients, Jacobians, and batch operations for complex numerical workflows. ([source](https://ml-explore.github.io/mlx/build/html/usage/quick_start.html))
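
Function transformations compose because each one takes a function and returns a new function. The sketch below illustrates that composition in pure Python. Note the substitutions: `grad` here uses central finite differences rather than automatic differentiation, and `vmap` is a plain loop rather than true vectorization. Both helpers are hypothetical stand-ins, not MLX's transforms.

```python
def grad(fn, eps=1e-5):
    """Return a function approximating d fn / dx by central differences."""
    def dfn(x):
        return (fn(x + eps) - fn(x - eps)) / (2 * eps)
    return dfn


def vmap(fn):
    """Return a function that applies fn to every element of a batch."""
    def batched(xs):
        return [fn(x) for x in xs]
    return batched


def cube(x):
    return x ** 3


# Compose the two transforms: derivative of cube, applied across a batch
batched_slopes = vmap(grad(cube))([1.0, 2.0, 3.0])
```

The result approximates the derivative `3 * x**2` at each batch element.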

### Data & Databases

- [Matrix Multiplication Utilities](https://awesome-repositories.com/f/data-databases/batch-processing/batch-matrix-multiplication-utilities/matrix-multiplication-utilities.md) — Calculates the product of two arrays, supporting batched operations and automatic broadcasting. ([source](https://ml-explore.github.io/mlx/build/html/python/_autosummary/mlx.core.matmul.html))
- [Array View Creation](https://awesome-repositories.com/f/data-databases/data-processing-pipelines/data-transformation/array-tensor-manipulation/array-filtering/array-view-creation.md) — Creates memory-efficient array views that reinterpret data without explicit copies. ([source](https://ml-explore.github.io/mlx/build/html/python/_autosummary/mlx.core.array.view.html))
- [Data Storage](https://awesome-repositories.com/f/data-databases/data-engineering-infrastructure/data-persistence-storage/data-storage.md) — Supports reading and writing numerical data using binary formats like npy, safetensors, and gguf. ([source](https://ml-explore.github.io/mlx/build/html/python/_autosummary/mlx.core.load.html))
- [Cross-Device Operation Execution](https://awesome-repositories.com/f/data-databases/data-synchronization/cross-device-synchronization-engines/cross-device-operation-execution.md) — Performs computations across different processors without manual data movement by utilizing a shared memory architecture. ([source](https://ml-explore.github.io/mlx/build/html/usage/unified_memory.html))
- [Device Configuration](https://awesome-repositories.com/f/data-databases/resource-allocation/device-allocators/device-configuration.md) — Directs array operations and memory allocation to specific hardware accelerators for consistent execution. ([source](https://ml-explore.github.io/mlx/build/html/python/_autosummary/mlx.core.set_default_device.html))
- [Intermediate Output Caching](https://awesome-repositories.com/f/data-databases/data-engineering-infrastructure/caching-performance/caching-strategies/query-result-caching/method-result-caches/intermediate-output-caching.md) — Stores attention layer results during token generation to prevent redundant calculations and speed up sequence processing time. ([source](https://ml-explore.github.io/mlx/build/html/examples/llama-inference.html))
- [Arithmetic Aggregators](https://awesome-repositories.com/f/data-databases/column-transformation/arithmetic-aggregators.md) — Computes arithmetic means, sums, and products of array elements along specified axes. ([source](https://ml-explore.github.io/mlx/build/html/python/_autosummary/mlx.core.array.mean.html))
- [Statistical Aggregators](https://awesome-repositories.com/f/data-databases/data-analysis-visualization/analytical-platforms-engines/advanced-analytics-functions/statistical-aggregators.md) — Aggregates array data into summary statistics or boolean states along specified axes. ([source](https://ml-explore.github.io/mlx/build/html/python/ops.html))
- [Data Format Interoperability](https://awesome-repositories.com/f/data-databases/data-format-interoperability.md) — Enables data exchange between frameworks using buffer protocols and standard serialization formats. ([source](https://ml-explore.github.io/mlx/build/html/python/_autosummary/mlx.core.array.tolist.html))
- [Index Sorting Utilities](https://awesome-repositories.com/f/data-databases/data-sorting-engines/index-sorting-utilities.md) — The library identifies indices for sorted or partitioned array elements to enable efficient ranking and selection of data points. ([source](https://ml-explore.github.io/mlx/build/html/python/ops.html))
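
Index sorting returns positions rather than values, which is what makes ranking and top-k selection possible. A minimal pure-Python sketch of the idea follows; the helper names are hypothetical and operate on plain lists rather than MLX arrays.

```python
def argsort(values):
    """Return the indices that would sort values in ascending order."""
    return sorted(range(len(values)), key=lambda i: values[i])


def top_k_indices(values, k):
    """Return the indices of the k largest values, largest first."""
    return argsort(values)[-k:][::-1]


scores = [0.1, 0.7, 0.05, 0.15]
order = argsort(scores)
best_two = top_k_indices(scores, 2)
```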

### Programming Languages & Runtimes

- [Arithmetic Broadcasting](https://awesome-repositories.com/f/programming-languages-runtimes/language-features-paradigms/language-features/array-operations/arithmetic-broadcasting.md) — The library computes the sum of two arrays or scalars using broadcasting semantics to align dimensions automatically for efficient numerical operations. ([source](https://ml-explore.github.io/mlx/build/html/python/_autosummary/mlx.core.add.html))
- [Element-wise Comparisons](https://awesome-repositories.com/f/programming-languages-runtimes/programming-utilities/data-structure-type-helpers/data-type-utilities/array-element-finding/element-wise-comparisons.md) — Compares corresponding elements of two arrays (for example, equality or strictly-greater-than tests), using broadcasting to handle different shapes. ([source](https://ml-explore.github.io/mlx/build/html/python/_autosummary/mlx.core.equal.html))
- [Axis Transformations](https://awesome-repositories.com/f/programming-languages-runtimes/language-features-paradigms/language-features/array-operations/axis-transformations.md) — Swaps two specified axes of an array, changing the order of its dimensions without altering the underlying data. ([source](https://ml-explore.github.io/mlx/build/html/python/_autosummary/mlx.core.array.swapaxes.html))
- [Joining](https://awesome-repositories.com/f/programming-languages-runtimes/language-features-paradigms/language-features/array-operations/joining.md) — Concatenates a sequence of arrays into a single array along a specified axis. ([source](https://ml-explore.github.io/mlx/build/html/python/_autosummary/mlx.core.concatenate.html))
- [Reshaping](https://awesome-repositories.com/f/programming-languages-runtimes/language-features-paradigms/language-features/array-operations/reshaping.md) — The library inserts new dimensions of size one into an array at specified axes to adjust its shape for broadcasting or compatibility. ([source](https://ml-explore.github.io/mlx/build/html/python/_autosummary/mlx.core.expand_dims.html))
- [Splitting Utilities](https://awesome-repositories.com/f/programming-languages-runtimes/language-features-paradigms/language-features/array-operations/splitting-utilities.md) — The library divides an array into multiple smaller arrays along a specified axis by providing either a number of sections or split indices. ([source](https://ml-explore.github.io/mlx/build/html/python/_autosummary/mlx.core.array.split.html))
- [Array Access and Modification](https://awesome-repositories.com/f/programming-languages-runtimes/programming-utilities/data-structure-type-helpers/data-type-utilities/array-element-finding/array-modification-utilities/array-access-and-modification.md) — Enables efficient manipulation of array data structures through slicing and index-based updates. ([source](https://ml-explore.github.io/mlx/build/html/usage/indexing.html))
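
The broadcasting semantics mentioned above can be stated as a shape rule. The sketch below implements the NumPy-style rule in pure Python, on the assumption that MLX follows the same convention; `broadcast_shape` is a hypothetical helper, not an MLX function.

```python
def broadcast_shape(a, b):
    """Return the shape two arrays broadcast to, or raise ValueError."""
    result = []
    # Walk dimensions right to left, treating missing dimensions as size 1
    for i in range(1, max(len(a), len(b)) + 1):
        da = a[-i] if i <= len(a) else 1
        db = b[-i] if i <= len(b) else 1
        if da != db and da != 1 and db != 1:
            raise ValueError(f"shapes {a} and {b} are not broadcastable")
        # A size-1 dimension stretches to match the other operand
        result.append(db if da == 1 else da)
    return tuple(reversed(result))


column_plus_row = broadcast_shape((3, 1), (4,))
batched = broadcast_shape((2, 3, 4), (3, 1))
```

Inserting a size-one dimension, as the Reshaping entry describes, is the usual way to make two otherwise incompatible shapes satisfy this rule.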

### Software Engineering & Architecture

- [Execution Stream Management](https://awesome-repositories.com/f/software-engineering-architecture/concurrent-execution-managers/execution-stream-management.md) — Manages hardware-level concurrency by isolating computational tasks into independent streams for parallel processing. ([source](https://ml-explore.github.io/mlx/build/html/python/_autosummary/mlx.core.clear_streams.html))
- [Computational Graph Optimizers](https://awesome-repositories.com/f/software-engineering-architecture/performance-reliability/performance-optimization/computational-efficiency/computational-graph-optimizers.md) — Compiles functions to merge operations and fuse kernels, reducing memory usage and increasing execution speed for complex workflows. ([source](https://ml-explore.github.io/mlx/build/html/usage/compile.html))
- [Shared Memory Management](https://awesome-repositories.com/f/software-engineering-architecture/shared-memory-management.md) — Shares array data across CPU and GPU devices without explicit copies to minimize overhead. ([source](https://ml-explore.github.io/mlx/build/html/index.html))
- [Graph Evaluation Scheduling](https://awesome-repositories.com/f/software-engineering-architecture/execution-graphs/graph-evaluation-scheduling.md) — Triggers the execution of accumulated compute graphs at specific intervals to balance processing overhead against the benefits of batching. ([source](https://ml-explore.github.io/mlx/build/html/usage/lazy_evaluation.html))
- [Execution Pipeline Transformation](https://awesome-repositories.com/f/software-engineering-architecture/transformation-pipelines/execution-pipeline-transformation.md) — Applies multiple function modifications like gradients or compilation in sequence to build complex and highly optimized workflows. ([source](https://ml-explore.github.io/mlx/build/html/usage/compile.html))
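
The benefit of fusing operations can be seen in a toy model: running element-wise operations one after another makes a full pass over the data for each operation and materializes an intermediate result each time, while a fused version makes a single pass. This pure-Python sketch is illustrative only; the helper names are hypothetical and it says nothing about how MLX generates kernels.

```python
def run_unfused(xs, ops):
    """Apply each op in its own pass, materializing intermediates."""
    passes = 0
    for op in ops:
        xs = [op(x) for x in xs]  # a new intermediate list per op
        passes += 1
    return xs, passes


def fuse(ops):
    """Combine several element-wise ops into one per-element function."""
    def fused(x):
        for op in ops:
            x = op(x)
        return x
    return fused


def run_fused(xs, ops):
    """Apply all ops in a single pass with no intermediate lists."""
    kernel = fuse(ops)
    return [kernel(x) for x in xs], 1


ops = [lambda x: x * 2, lambda x: x + 1, lambda x: x * x]
data = [1, 2, 3]
unfused_out, unfused_passes = run_unfused(data, ops)
fused_out, fused_passes = run_fused(data, ops)
```

Both versions compute `(2 * x + 1) ** 2` for each element; only the number of passes over memory differs.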

### System Administration & Monitoring

- [Hardware Acceleration Managers](https://awesome-repositories.com/f/system-administration-monitoring/device-management-tools/hardware-acceleration-managers.md) — Manages hardware-level concurrency and workload distribution across available processing units. ([source](https://ml-explore.github.io/mlx/build/html/python/_autosummary/stream_class.html))

### DevOps & Infrastructure

- [Distributed Computing](https://awesome-repositories.com/f/devops-infrastructure/distributed-computing.md) — Shares processing loads across multiple physical machines using communication backends. ([source](https://ml-explore.github.io/mlx/build/html/usage/distributed.html))

### Networking & Communication

- [Distributed Parameter Sharding](https://awesome-repositories.com/f/networking-communication/distributed-systems-p2p/distributed-computing/model-parallelism-techniques/distributed-parameter-sharding.md) — Splits model parameters across multiple devices in-place to reduce memory footprint. ([source](https://ml-explore.github.io/mlx/build/html/examples/tensor_parallelism.html))
- [Distributed Execution Runtimes](https://awesome-repositories.com/f/networking-communication/distributed-systems-p2p/distributed-computing/distributed-execution-runtimes.md) — Manages process initialization and environment configuration for distributed script execution. ([source](https://ml-explore.github.io/mlx/build/html/usage/distributed.html))

### Development Tools & Productivity

- [Build Configurations](https://awesome-repositories.com/f/development-tools-productivity/build-tooling/build-orchestration-logic/build-orchestration-configuration/build-configuration-systems/build-configurations.md) — Configures build-time parameters including hardware backends and compilation settings for deployment. ([source](https://ml-explore.github.io/mlx/build/html/install.html))
- [Lazy Initialization](https://awesome-repositories.com/f/development-tools-productivity/lazy-initialization.md) — Postpones array operations and memory allocation until results are explicitly requested to optimize resource usage and speed up initialization. ([source](https://ml-explore.github.io/mlx/build/html/usage/lazy_evaluation.html))
- [Variable Input Shape Support](https://awesome-repositories.com/f/development-tools-productivity/input-binding-libraries/variable-input-shape-support.md) — Processes inputs with changing dimensions in compiled functions without triggering expensive recompilation cycles to maintain high performance. ([source](https://ml-explore.github.io/mlx/build/html/usage/compile.html))

### Education & Learning Resources

- [Array Initialization](https://awesome-repositories.com/f/education-learning-resources/array-tutorials/array-initialization.md) — Provides utilities for initializing arrays with specific shapes, types, and constant values. ([source](https://ml-explore.github.io/mlx/build/html/python/_autosummary/mlx.core.identity.html))
- [Quantization Reconstructors](https://awesome-repositories.com/f/education-learning-resources/matrix-operations/quantization-reconstructors.md) — Reconstructs high-precision numerical arrays from quantized representations using scales and biases. ([source](https://ml-explore.github.io/mlx/build/html/python/_autosummary/mlx.core.dequantize.html))
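
Reconstruction from scales and biases can be illustrated with a simple affine scheme: each value is stored as a small integer code, and the original is approximated as `code * scale + bias`. The sketch below quantizes a single group of values in pure Python. It is a simplified assumption about the general technique, not MLX's exact scheme, and the function names mirror the concept rather than any MLX signature.

```python
def quantize(values, bits=4):
    """Map floats to integer codes in [0, 2**bits - 1] plus scale and bias."""
    levels = 2 ** bits - 1
    lo, hi = min(values), max(values)
    scale = (hi - lo) / levels
    if scale == 0:
        scale = 1.0  # constant input: every code is 0 and bias restores it
    bias = lo
    codes = [round((v - bias) / scale) for v in values]
    return codes, scale, bias


def dequantize(codes, scale, bias):
    """Reconstruct approximate floats from integer codes."""
    return [c * scale + bias for c in codes]


values = [-1.0, -0.5, 0.0, 0.25, 1.0]
codes, scale, bias = quantize(values, bits=4)
restored = dequantize(codes, scale, bias)
```

The reconstruction error of each element is bounded by half a quantization step, which is why more bits (a smaller `scale`) give a closer approximation.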