# microsoft/LoRA

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/microsoft-lora).**

13,264 stars · 886 forks · Python · MIT

## Links

- GitHub: https://github.com/microsoft/LoRA
- Homepage: https://arxiv.org/abs/2106.09685
- awesome-repositories: https://awesome-repositories.com/repository/microsoft-lora.md

## Topics

`adaptation` `deberta` `deep-learning` `gpt-2` `gpt-3` `language-model` `lora` `low-rank` `pytorch` `roberta`

## Description

LoRA (Low-Rank Adaptation) is a framework for parameter-efficient fine-tuning of large neural networks. It injects trainable low-rank decomposition matrices into frozen model layers, so a model can be adapted to a specific task while the original base weights stay unchanged.
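The injection described above can be sketched without any framework. The following is a minimal, dependency-free illustration and not the library's own code: the helper names (`matvec`, `lora_forward`), the tiny dimensions, and the `alpha` value are assumptions chosen for the example. It follows the formulation in the LoRA paper, where the adapted output is `h = W0 x + (alpha / r) * B A x`, `A` starts with small random values, and `B` starts at zero.

```python
import random

def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]

def lora_forward(W0, A, B, x, alpha, r):
    """Compute h = W0 x + (alpha / r) * B (A x); W0 is never modified."""
    base = matvec(W0, x)
    update = matvec(B, matvec(A, x))  # low-rank path: k -> r -> d
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]

random.seed(0)
d, k, r, alpha = 4, 6, 2, 4  # illustrative sizes only (assumption)

# Frozen base weight, d x k.
W0 = [[random.gauss(0, 1) for _ in range(k)] for _ in range(d)]
# Trainable factors: A is r x k with small random values, B is d x r and zero.
A = [[random.gauss(0, 0.01) for _ in range(k)] for _ in range(r)]
B = [[0.0] * r for _ in range(d)]

x = [random.gauss(0, 1) for _ in range(k)]
h = lora_forward(W0, A, B, x, alpha, r)

# Because B starts at zero, the adapted layer initially matches the base layer.
print(all(abs(a - b) < 1e-12 for a, b in zip(h, matvec(W0, x))))
```

Starting `B` at zero means training begins from the behavior of the pretrained model, and only `A` and `B` would receive gradient updates.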

Trained low-rank matrices can be merged directly into the base model weights. Merging removes the extra computation the adapter would otherwise add at inference time, so an adapted model runs with the same latency as the original. The framework also supports modular adaptation: users switch between tasks by loading and unloading lightweight matrices, without modifying the underlying model.
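A small worked example may make the merge step concrete. This is a sketch under stated assumptions, not the library's implementation: `merge`, `unmerge`, and the hand-picked numbers are hypothetical. It shows that folding the update into the weight (`W = W0 + scale * B A`) gives the same output as running the adapter as a separate path, and that subtracting the update recovers the base weight so another task's matrices could be applied.

```python
def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*Y))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in X]

def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in M]

def merge(W0, A, B, scale):
    """Return W = W0 + scale * (B @ A), which has the same shape as W0."""
    BA = matmul(B, A)
    return [[w + scale * u for w, u in zip(w_row, u_row)]
            for w_row, u_row in zip(W0, BA)]

def unmerge(W, A, B, scale):
    """Subtract the low-rank update again to recover the base weight."""
    BA = matmul(B, A)
    return [[w - scale * u for w, u in zip(w_row, u_row)]
            for w_row, u_row in zip(W, BA)]

W0 = [[1.0, 2.0, 3.0],
      [4.0, 5.0, 6.0]]    # frozen base weight, 2 x 3
A = [[0.5, -1.0, 0.25]]   # 1 x 3, rank r = 1
B = [[2.0], [-4.0]]       # 2 x 1
scale = 0.5               # stands in for alpha / r
x = [1.0, 2.0, 4.0]

# Unmerged: base path plus a separate low-rank path (two extra matvecs).
low_rank = matvec(B, matvec(A, x))
h_unmerged = [b + scale * u for b, u in zip(matvec(W0, x), low_rank)]

# Merged: a single matvec with a weight of the original shape.
W = merge(W0, A, B, scale)
h_merged = matvec(W, x)

print(h_unmerged)                       # [16.5, 39.0]
print(h_merged)                         # [16.5, 39.0]
print(unmerge(W, A, B, scale) == W0)    # True
```

The values here are exactly representable in floating point, so the comparison is exact; with real weights, merged and unmerged outputs agree only up to rounding error.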

The library also supports storage-efficient checkpointing and optional training of bias vectors. Because only a small fraction of the parameters is trained, a checkpoint needs to hold only the adapter parameters, which reduces disk usage and makes adapted states easier to move between systems.
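The checkpointing idea can be illustrated with a plain dictionary standing in for a model's state. Everything in this sketch is an assumption made for illustration: the function `select_adapter_state`, the `bias` option values, and the parameter names (including the `lora_` naming pattern) are hypothetical and are not presented as the library's API.

```python
def select_adapter_state(state, bias="none"):
    """Keep adapter entries, plus bias vectors when bias == "all"."""
    if bias not in ("none", "all"):
        raise ValueError(f"unsupported bias option: {bias!r}")
    kept = {}
    for name, value in state.items():
        is_adapter = "lora_" in name  # assumed naming convention
        is_bias = bias == "all" and name.endswith("bias")
        if is_adapter or is_bias:
            kept[name] = value
    return kept

def count_values(state):
    """Count scalar entries across 1-D and 2-D list-based tensors."""
    total = 0
    for value in state.values():
        rows = value if isinstance(value[0], list) else [value]
        total += sum(len(row) for row in rows)
    return total

# Toy state for one 8 x 8 projection with a rank-2 adapter (hypothetical names).
full_state = {
    "attn.q_proj.weight": [[0.1] * 8 for _ in range(8)],
    "attn.q_proj.bias": [0.0] * 8,
    "attn.q_proj.lora_A": [[0.01] * 8 for _ in range(2)],
    "attn.q_proj.lora_B": [[0.0] * 2 for _ in range(8)],
}

adapter_only = select_adapter_state(full_state)
with_bias = select_adapter_state(full_state, bias="all")

print(count_values(full_state))    # 104
print(count_values(adapter_only))  # 32
print(count_values(with_bias))     # 40
```

Only the filtered mapping would be written to disk; at load time it is applied on top of the unchanged base weights. The saving grows with layer size, since the base weight scales with `d * k` while the adapter scales with `r * (d + k)`.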

## Tags

### Artificial Intelligence & ML

- [Weight Merging Utilities](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/architectures/instruction-tuned-language-models/weight-space-merging-techniques/weight-merging-utilities.md) — Integrates trained adapter matrices directly into base model weights so that the adapter adds no extra inference latency.
- [Large Language Model Fine-Tuning Frameworks](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/integrated-development-platforms/machine-learning-platforms/large-language-model-fine-tuning-frameworks.md) — Adapts massive pre-trained neural networks to specific tasks by training only a tiny fraction of the total model parameters.
- [Frozen Base Models](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/model-training-and-tuning/architecture-and-operations/model-architecture/frozen-base-models.md) — Preserves base model integrity by keeping primary parameters immutable while training only injected adaptation matrices.
- [Parameter Efficient Fine-Tuning](https://awesome-repositories.com/f/artificial-intelligence-ml/parameter-efficient-fine-tuning.md) — Provides a framework for training large language models by injecting trainable low-rank decomposition matrices into frozen layers.
- [Parameter Adaptation Techniques](https://awesome-repositories.com/f/artificial-intelligence-ml/parameter-adaptation-techniques.md) — Enables efficient model refinement by injecting trainable low-rank decomposition matrices into frozen neural network layers. ([source](https://github.com/microsoft/LoRA#readme))
- [Projection Merging](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/architectures/instruction-tuned-language-models/weight-space-merging-techniques/weight-merging-utilities/projection-merging.md) — Combines multiple projection matrices into single layers to optimize performance and reduce computational overhead during inference. ([source](https://github.com/microsoft/LoRA#readme))
- [Inference Optimization Tools](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/model-optimization-and-inference/serving-and-runtime/inference-optimization-utilities/inference-optimization-tools.md) — Merges task-specific adaptation matrices into primary model weights to avoid added inference overhead, and keeps checkpoints small by storing only the adaptation matrices.
- [Inference Optimizations](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/model-optimization-and-inference/serving-and-runtime/inference-optimizations.md) — Merges task-specific adaptation weights into primary model layers so that adaptation adds no computational overhead or latency during live model execution.
- [Model Adapters](https://awesome-repositories.com/f/artificial-intelligence-ml/model-adapters.md) — Supports modular swapping of task-specific configurations by loading and unloading lightweight adaptation matrices at runtime.
- [Model Checkpointing](https://awesome-repositories.com/f/artificial-intelligence-ml/model-checkpointing.md) — Minimizes disk space requirements by saving only small task-specific adaptation matrices instead of full model weights.
- [Neural Network Layers](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/frameworks/model-construction/neural-network-layers.md) — Injects trainable low-rank decomposition matrices into frozen model layers to refine performance while preserving original base knowledge.
- [Embedding Bias Adjustments](https://awesome-repositories.com/f/artificial-intelligence-ml/natural-language-processing/word-embeddings/embedding-bias-adjustments.md) — Optimizes task performance by training specific bias vectors alongside low-rank matrices for improved parameter efficiency.
- [Neural Network Layers](https://awesome-repositories.com/f/artificial-intelligence-ml/neural-network-layers.md) — Applies low-rank updates to targeted weight subsets, such as projection matrices or embedding layers, to improve performance. ([source](https://github.com/microsoft/LoRA#readme))

### Scientific & Mathematical Computing

- [Low-Rank Decompositions](https://awesome-repositories.com/f/scientific-mathematical-computing/high-performance-execution-environments/scientific-computing-platforms/scientific-computing/matrix-operations/matrix-vector-products/low-rank-decompositions.md) — Reduces trainable parameter counts by representing large weight updates as the product of two significantly smaller matrices.
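
The parameter saving from a low-rank decomposition is simple arithmetic: a full update to a `d x k` weight has `d * k` entries, while the factors `B` (`d x r`) and `A` (`r x k`) together have `r * (d + k)`. The sketch below uses illustrative sizes; the function names and the choice of `d = k = 4096`, `r = 8` are assumptions for the example, not figures from the repository.

```python
def full_update_params(d, k):
    """Entries in a dense d x k weight update."""
    return d * k

def low_rank_params(d, k, r):
    """Entries in B (d x r) plus A (r x k)."""
    return r * (d + k)

d = k = 4096  # illustrative layer size (assumption)
r = 8         # illustrative rank (assumption)

full = full_update_params(d, k)
low = low_rank_params(d, k, r)

print(full)         # 16777216
print(low)          # 65536
print(full // low)  # 256
```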

### Data & Databases

- [Data Storage Optimizers](https://awesome-repositories.com/f/data-databases/data-storage-optimizers.md) — Minimizes storage requirements by saving only small task-specific adaptation matrices instead of storing entire sets of original model weights. ([source](https://github.com/microsoft/LoRA#readme))

### Content Management & Publishing

- [Model Checkpoint Exporters](https://awesome-repositories.com/f/content-management-publishing/content-formats-exporting/export-formats/model-checkpoint-exporters.md) — Extracts and saves adapted parameter states into portable file formats to facilitate deployment across different hardware systems. ([source](https://github.com/microsoft/LoRA#readme))

### Software Engineering & Architecture

- [Inference Task Interruption](https://awesome-repositories.com/f/software-engineering-architecture/execution-pausing/inference-task-interruption.md) — Supports instant switching between different task-specific model configurations without introducing processing delays during live inference. ([source](https://github.com/microsoft/LoRA#readme))
