# state-spaces/mamba

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/state-spaces-mamba).**

17,215 stars · 1,596 forks · Python · apache-2.0

## Links

- GitHub: https://github.com/state-spaces/mamba
- awesome-repositories: https://awesome-repositories.com/repository/state-spaces-mamba.md

## Description

Mamba is a deep learning library for building and training sequence models that capture long-range dependencies in time linear in the sequence length. It is built on selective state space models, and its blocks can replace traditional attention layers in neural network architectures.

Its distinguishing feature is data-dependent state gating: the state space parameters are computed from the input, so the model can filter information flow dynamically as it reads the sequence. For throughput, hardware-optimized custom kernels run the state space computation directly on GPUs. A parallel scan algorithm evaluates the recurrence without the quadratic cost in sequence length that attention incurs on long sequences.
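The data-dependent gating and the scan described above can be illustrated with a small NumPy sketch. It is not the library's kernel code: the projection matrices `W_delta`, `W_B`, `W_C`, the function names, and the tensor shapes are hypothetical, and the scan shown is the simple Hillis-Steele variant, which does more total work than a work-efficient scan but makes the associative operator easy to see.

```python
import numpy as np

def softplus(z):
    # Numerically stable log(1 + exp(z)); keeps the step size positive.
    return np.logaddexp(0.0, z)

def selective_params(x, A, W_delta, W_B, W_C):
    """Derive per-token coefficients from the input (the "selective" part).

    x: (L, D) input sequence, A: (D, N) negative state matrix (diagonal form).
    All projection matrices are illustrative assumptions.
    """
    delta = softplus(x @ W_delta)                           # (L, D) step sizes
    B = x @ W_B                                             # (L, N) input gate
    C = x @ W_C                                             # (L, N) readout
    a = np.exp(delta[:, :, None] * A[None])                 # (L, D, N) decay
    b = delta[:, :, None] * B[:, None, :] * x[:, :, None]   # (L, D, N) update
    return a, b, C

def sequential_scan(a, b):
    """Reference recurrence h_t = a_t * h_{t-1} + b_t, one step per token."""
    h = np.zeros_like(b[0])
    out = np.empty_like(b)
    for t in range(len(b)):
        h = a[t] * h + b[t]
        out[t] = h
    return out

def parallel_scan(a, b):
    """Inclusive scan using the associative operator
    (a1, b1) followed by (a2, b2) -> (a1 * a2, a2 * b1 + b2).
    Each round is a single vectorized update over the whole sequence.
    """
    a, b = a.copy(), b.copy()
    shift = 1
    while shift < len(b):
        # Combine each position with the partial result `shift` steps earlier.
        b[shift:] = a[shift:] * b[:-shift] + b[shift:]
        a[shift:] = a[shift:] * a[:-shift]
        shift *= 2
    return b

rng = np.random.default_rng(0)
L, D, N = 16, 4, 3
x = rng.normal(size=(L, D))
A = -np.exp(rng.normal(size=(D, N)))  # negative entries give decay in (0, 1)
a, b, C = selective_params(
    x, A,
    rng.normal(size=(D, D)), rng.normal(size=(D, N)), rng.normal(size=(D, N)),
)
h = parallel_scan(a, b)
y = np.einsum("ldn,ln->ld", h, C)     # read the state out through C_t
assert np.allclose(h, sequential_scan(a, b))
```

Because `a` and `b` depend on `x`, a token can drive its decay toward 1 (keep the state) or its update toward 0 (ignore the input), which is the filtering behaviour the description refers to.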

The library provides tools for building deep neural networks by stacking selective state space blocks into backbones. It supports large-scale training and inference through tensor parallelism, which splits model parameters across multiple devices. It also includes utilities for weight initialization, pre-trained model loading, and performance benchmarking, covering the end-to-end sequence modeling workflow.
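To make the idea of splitting parameters across devices concrete, here is a generic sketch of a common tensor-parallel pattern: a column-sharded projection followed by a row-sharded projection. It is an assumption-laden illustration rather than the library's distributed code. The devices are simulated with a Python loop, `tensor_parallel_mlp` is a hypothetical name, and ReLU is a placeholder activation.

```python
import numpy as np

def tensor_parallel_mlp(x, W_in, W_out, n_devices):
    """Two-layer projection with weights sharded across simulated devices.

    W_in is split by columns and W_out by rows, so each device holds
    1 / n_devices of the parameters. The hidden width must be divisible
    by n_devices for np.split to succeed.
    """
    in_shards = np.split(W_in, n_devices, axis=1)    # column-parallel
    out_shards = np.split(W_out, n_devices, axis=0)  # row-parallel
    partials = []
    for W_i, W_o in zip(in_shards, out_shards):
        # Each device sees the full input but only its slice of the hidden
        # units; the elementwise activation needs no communication.
        hidden = np.maximum(x @ W_i, 0.0)
        partials.append(hidden @ W_o)
    # Summing the partial outputs stands in for an all-reduce across devices.
    return sum(partials)

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 16))
W_in = rng.normal(size=(16, 64))
W_out = rng.normal(size=(64, 16))

sharded = tensor_parallel_mlp(x, W_in, W_out, n_devices=4)
unsharded = np.maximum(x @ W_in, 0.0) @ W_out
assert np.allclose(sharded, unsharded)
```

The single reduction at the end is the only point where the simulated devices exchange data, which is why this pairing of column and row sharding is a popular choice.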

Installation compiles the custom kernel source code so that the kernels are optimized for the target hardware.

## Tags

### Web Development

- [Deep Learning Frameworks](https://awesome-repositories.com/f/web-development/state-management-models/state-space-models/deep-learning-frameworks.md) — Provides a deep learning framework for building sequence models that process long-range dependencies with linear-time efficiency.
- [State-Space Models](https://awesome-repositories.com/f/web-development/state-management-models/state-space-models.md) — Implements selective state space modeling to process sequential data with linear-time complexity.
- [MIMO State Space Configurations](https://awesome-repositories.com/f/web-development/state-management-models/state-space-models/mimo-state-space-configurations.md) — Defines multi-input multi-output architectures with configurable parameters to optimize memory and throughput. ([source](https://github.com/state-spaces/mamba/blob/main/mamba_ssm/modules/mamba3.py))

### Artificial Intelligence & ML

- [Selective State Space Models](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/architectures/sequence-models/selective-state-space-models.md) — Implements selective state space modeling to process long-range dependencies with linear-time efficiency. ([source](https://github.com/state-spaces/mamba/blob/main/pyproject.toml))
- [Linear-Time Sequence Models](https://awesome-repositories.com/f/artificial-intelligence-ml/sequence-learning-models/linear-time-sequence-models.md) — Processes long sequences of data with linear computational efficiency to capture complex dependencies.
- [Selective State Space Implementations](https://awesome-repositories.com/f/artificial-intelligence-ml/deep-learning-framework-implementations/selective-state-space-implementations.md) — Constructs deep learning sequence modeling blocks that process data in linear time. ([source](https://github.com/state-spaces/mamba/blob/main/README.md))
- [Hardware Acceleration Kernels](https://awesome-repositories.com/f/artificial-intelligence-ml/hardware-acceleration-kernels.md) — Ships hardware-optimized custom kernels that execute complex state space operations directly on graphics processing units.
- [Inference Optimization Kernels](https://awesome-repositories.com/f/artificial-intelligence-ml/inference-optimization-kernels.md) — Includes optimized hardware-specific kernels for executing complex state space calculations during model training and inference.
- [Large-Scale Model Training](https://awesome-repositories.com/f/artificial-intelligence-ml/large-scale-model-training.md) — Enables large-scale model training through tensor-parallel distribution and specialized weight initialization strategies.
- [Sequence Generation Runtimes](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/frameworks/inference-runtimes/high-performance-ai-inference/sequence-generation-runtimes.md) — Executes optimized sequence generation tasks using custom hardware kernels for rapid text completion.
- [Selective State Scanning Operations](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/model-optimization-and-inference/training-algorithms/deep-learning-optimization/selective-state-scanning-operations.md) — Executes high-performance selective state space model operations using optimized hardware kernels. ([source](https://github.com/state-spaces/mamba/blob/main/mamba_ssm/ops/selective_scan_interface.py))
- [Tensor Parallelism](https://awesome-repositories.com/f/artificial-intelligence-ml/tensor-parallelism.md) — Supports tensor parallelism to split large model parameters across multiple hardware devices for efficient training and inference.
- [Deep Learning Architectures](https://awesome-repositories.com/f/artificial-intelligence-ml/deep-learning-architectures.md) — Provides a collection of neural network blocks designed to replace traditional attention mechanisms with state space operations.
- [Inference Benchmarking Tools](https://awesome-repositories.com/f/artificial-intelligence-ml/inference-benchmarking-tools.md) — Provides utilities for measuring the generation speed and computational throughput of sequence models during inference. ([source](https://github.com/state-spaces/mamba#readme))
- [Sequence Models](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/architectures/sequence-models.md) — Builds deep learning architectures by stacking state space layers to process sequences with linear-time efficiency. ([source](https://github.com/state-spaces/mamba/blob/main/mamba_ssm/models/mixer_seq_simple.py))
- [Selective State Modeling Utilities](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/model-management/model-selection-utilities/selective-state-modeling-utilities.md) — Processes sequential data using selective state space mechanisms for efficient long-range dependency modeling. ([source](https://github.com/state-spaces/mamba/blob/main/mamba_ssm/modules/mamba_simple.py))
- [Selective State Neural Architectures](https://awesome-repositories.com/f/artificial-intelligence-ml/neural-network-building-blocks/selective-state-neural-architectures.md) — Constructs end-to-end sequence models by stacking state space blocks with task-specific output heads. ([source](https://github.com/state-spaces/mamba#readme))
- [Selective State Backbones](https://awesome-repositories.com/f/artificial-intelligence-ml/pre-training-pipelines/backbone-model-integration/selective-state-backbones.md) — Constructs deep neural network backbones by stacking selective state space blocks. ([source](https://github.com/state-spaces/mamba/blob/main/README.md))
- [Sequence Modeling](https://awesome-repositories.com/f/artificial-intelligence-ml/sequence-modeling.md) — Processes input sequences in linear time using selective state space models for long-range dependencies. ([source](https://github.com/state-spaces/mamba/blob/f0affcf69f06d1d06cef018ff640bf080a11c421/mamba_ssm/modules/mamba_simple.py))
- [Sequence Completion Sampling](https://awesome-repositories.com/f/artificial-intelligence-ml/generative-ai-resources/decoding-generation-controls/ai-completion-services/ai-completion-sampling/sequence-completion-sampling.md) — Predicts subsequent tokens using loaded models with configurable sampling parameters. ([source](https://github.com/state-spaces/mamba/blob/main/README.md))
- [Generative Text Inference](https://awesome-repositories.com/f/artificial-intelligence-ml/generative-ai-resources/generative-ai/generative-text-inference.md) — Produces sequences from a prompt using trained models with configurable sampling parameters. ([source](https://github.com/state-spaces/mamba#readme))
- [Hidden State Gate Controllers](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/frameworks/model-construction/sequential-containers/recurrent-state-managers/hidden-state-gate-controllers.md) — Implements gating logic that dynamically filters information flow based on input sequences within recurrent state architectures.
- [Model Loading](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/model-training-and-tuning/data-and-checkpointing/model-loading.md) — Downloads and initializes pre-trained model weights from remote repositories for immediate inference. ([source](https://github.com/state-spaces/mamba#readme))
- [Pretrained Sequence Model Loaders](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/model-training-and-tuning/data-and-checkpointing/model-loading/pretrained-sequence-model-loaders.md) — Downloads and initializes pre-trained model weights from external repositories for inference. ([source](https://github.com/state-spaces/mamba/blob/main/README.md))
- [Heavy-Tail Activations](https://awesome-repositories.com/f/artificial-intelligence-ml/activation-functions/heavy-tail-activations.md) — Stabilizes training by applying specific activation functions to data-dependent state parameters. ([source](https://github.com/state-spaces/mamba/blob/main/mamba_ssm/modules/mamba3.py))
- [Neural Network Layers](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/frameworks/model-construction/neural-network-layers.md) — Provides architectural building blocks for stacking selective state space layers into hierarchical neural network backbones.
- [Weight Initialization](https://awesome-repositories.com/f/artificial-intelligence-ml/weight-initialization.md) — Applies specialized scaling schemes to neural network parameters to stabilize training across deep residual architectures. ([source](https://github.com/state-spaces/mamba/blob/main/mamba_ssm/models/mixer_seq_simple.py))

### Software Engineering & Architecture

- [Custom C++ Kernels](https://awesome-repositories.com/f/software-engineering-architecture/performance-reliability/performance-optimization/computational-efficiency/custom-kernel-accelerators/custom-c-kernels.md) — Provides hardware-optimized custom kernels to maximize throughput for complex state space calculations. ([source](https://github.com/state-spaces/mamba/blob/main/MANIFEST.in))

### DevOps & Infrastructure

- [Parallel Scanning Algorithms](https://awesome-repositories.com/f/devops-infrastructure/scan-orchestration/parallel-scanning-algorithms.md) — Implements parallel scanning algorithms to compute hidden state transitions with linear-time efficiency.

### Networking & Communication

- [Distributed Parameter Sharding](https://awesome-repositories.com/f/networking-communication/distributed-systems-p2p/distributed-computing/model-parallelism-techniques/distributed-parameter-sharding.md) — Splits model parameters and sequence processing across multiple devices using tensor parallelism. ([source](https://github.com/state-spaces/mamba/blob/main/mamba_ssm/modules/mamba2.py))
