30 open-source projects similar to cloneofsimo/lora, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best Lora alternative.
sd-scripts is a suite of utilities designed for fine-tuning generative models, preprocessing datasets, and converting model weights. It provides a collection of scripts for executing Stable Diffusion training through methods such as DreamBooth, textual inversion, and full fine-tuning, alongside a framework for creating and managing Low-Rank Adaptation weights. The project features specialized capabilities for model weight conversion between different architectures and precision formats. It includes tools for merging adaptation weights into base models, extracting weights from trained models,
MiniCPM is a collection of small language models designed for local, on-device deployment in resource-constrained environments. The project focuses on running dense Transformer models on consumer hardware, including GPUs, CPUs, and Apple Silicon, without requiring custom code forks. The project distinguishes itself through heavy optimization for edge hardware, utilizing quantized weight compression in GGUF and MLX formats to reduce memory overhead. It implements advanced inference techniques such as speculative sampling and radix-tree prefix caching to accelerate generation speed and throughp
This project is a LLaMA fine-tuning framework and training pipeline designed for instruction tuning using low-rank adaptation. It provides a system for adapting large language models through a portable, containerized machine learning environment and a web-based inference interface. The framework enables the training of low-rank adapters and the subsequent merging of these weights back into base models to create standalone checkpoints. It includes utilities for defining and formatting prompt templates to ensure consistent data structures during the fine-tuning and inference processes. The pro
ComfyUI-nunchaku is a 4-bit diffusion inference engine and a set of nodes for running low-precision quantized diffusion models within ComfyUI visual workflows. It provides a backend that reduces memory overhead and increases generation speed for transformer models. The project includes specialized tools for identity-preserving generation and an image-to-image guidance toolkit that uses depth maps and reference images. It also features a multimodal visual question answering implementation and a utility for merging multiple quantized model files into single unified files. The engine covers a b
LMFlow is a comprehensive suite for large language model fine-tuning, context extension, multimodal processing, and inference execution. It provides a toolkit for updating model parameters through full tuning or memory-efficient adapter algorithms, alongside an inference engine for executing tuned models via command-line or web-based interfaces. The framework includes a dedicated alignment suite for supervised tuning and reward model training to refine model behavior. It features a context window extender to increase maximum input lengths and a multimodal framework for building chatbots that
Sana is a framework for high-resolution image and video synthesis based on a linear diffusion transformer. It provides a toolkit for the training, fine-tuning, and execution of text-to-image and text-to-video models, as well as a video generative world model capable of simulating physical environments with precise spatial control. The project is distinguished by its use of linear complexity layers to handle high resolutions and its support for long-form, minute-length video generation in real time. It implements a two-stage inference paradigm that separates structural generation from visual t
IF is a text-to-image diffusion system that translates natural language descriptions into visual imagery. The project provides a generative pipeline for creating images, an inpainting tool for modifying specific image sections, and a super-resolution upscaler to increase pixel density and clarity. The system includes a concept fine-tuning framework that allows for the teaching of new visual concepts by updating a small set of parameters. It also supports image style transfer to apply the aesthetic characteristics of a reference image to a new output.
Facechain is a generative AI toolchain and portrait generator designed to create personalized synthetic identities and consistent digital portraits. It provides a pipeline for training and refining diffusion models to produce subject-driven image synthesis from reference photos. The project focuses on digital twin generation, enabling the creation of a personalized model from a single image to maintain identity consistency across various poses and artistic styles. It utilizes identity fusion and similarity sorting to balance facial accuracy with stylized visual effects. The toolkit covers a
This project provides a Chinese large language model based on the LLaMA architecture. It is an instruction-tuned model optimized for natural language processing and multi-turn conversations in Chinese. The system includes a framework for parameter-efficient fine-tuning using low-rank adaptation and quantization to reduce memory requirements. It also implements retrieval augmented generation for local document question answering and supports long-context processing for sequences up to 64K tokens. The project covers a broad set of capabilities including supervised instruction tuning, reinforce
OpenNMT-py is a PyTorch neural machine translation framework used for training and deploying neural machine translation and large language models. It functions as a distributed model training system, an inference engine, and a toolkit for fine-tuning large language models. The framework distinguishes itself with a dedicated toolkit for adapting large language models through low-rank adaptation, quantization, and instruction tuning. It also includes a neural machine translation server that allows trained models to be hosted and exposed via REST API endpoints. The project covers a broad range
Kolors is a generative model implementation for synthesizing photorealistic images from natural language descriptions and visual references. It utilizes a latent diffusion model framework to produce high-fidelity imagery, operating within a compressed latent space to improve generation efficiency and quality. The system functions as a multilingual image generator, interpreting text prompts in multiple languages to produce semantically accurate visual outputs. It includes a custom model training pipeline that uses low-rank adaptation to teach the model specific subjects or artistic styles from
Firefly is a training framework and inference engine for large language models. It functions as a toolkit for pre-training and fine-tuning various open-weight architectures, providing a system for model alignment and parameter-efficient fine-tuning. The project includes utilities for merging adapter weights back into base models to create standalone files. It also provides a model alignment toolkit to format training data according to specific prompt templates, ensuring conversational consistency across different models. The framework supports distributed model training and preference-based
Lorax is a GPU-accelerated inference server and multi-adapter engine designed for serving large language models. It functions as a high-throughput system capable of deploying models via Kubernetes and managing the dynamic swapping of Low-Rank Adaptation adapters per request. The server distinguishes itself through multi-adapter dynamic batching, which allows requests using different adapter weights to be processed in a single GPU forward pass. It employs just-in-time adapter loading and weighted adapter merging to maximize throughput and enable multi-tasking without sacrificing performance.
This project is a cloud-based AI deployment system and latent diffusion model trainer. It provides a framework for launching image generation interfaces and training pipelines on remote GPU infrastructure, specifically serving as a text-to-image model fine-tuner. The system features a specialized training interface for fine-tuning Stable Diffusion models on custom image datasets. It allows for the creation of personalized visual outputs by training models on specific subjects or artistic styles using a small set of reference images. The software covers generative AI deployment, custom style
OpenRLHF is a training framework and alignment library designed for reinforcement learning from human feedback across distributed GPU clusters. It provides tools for aligning large language models and multimodal vision-language models using algorithms such as PPO, GRPO, and DPO. The framework distinguishes itself through a distributed inference engine that overlaps sample rollout with training to increase throughput. It supports scaling to models exceeding 70 billion parameters via parameter sharding and handles long-context sequences through ring-attention sequence parallelism. The project
MochiDiffusion is a local client for Stable Diffusion that functions as an AI image generation studio. It provides a workspace for performing text-to-image, image-to-image, and inpainting tasks, enabling the production of high-resolution images offline using local hardware and neural engine acceleration. The project includes a local model manager for importing, organizing, and converting machine learning models into compatible formats for offline execution. It features a ControlNet integration tool to guide structural composition and spatial layout, alongside a dedicated image upscaler that u
kohya_ss is a graphical user interface and workbench for fine-tuning diffusion models, specifically designed for Stable Diffusion. It provides a suite of tools for training generative AI models, including specialized interfaces for creating Low-Rank Adaptation weights and training ControlNet spatial control networks. The project distinguishes itself through integrated VRAM usage optimization and hardware acceleration, featuring specific support for Intel GPUs via XPU-accelerated libraries. It implements parameter-efficient training methods and memory-saving techniques like gradient checkpoint
This library provides a framework for parameter-efficient fine-tuning, enabling the adaptation of large pretrained models by training only a small subset of parameters. It functions as a distributed model training system and optimization toolkit, designed to reduce the computational and memory requirements typically associated with full model fine-tuning. The project distinguishes itself through a suite of methods for modular adapter composition, including low-rank matrix decomposition and activation-based scaling. It supports the integration of multiple task-specific adapter modules, allowin
This project is a comprehensive toolkit for adapting large language models to the Chinese language, providing a specialized framework for fine-tuning, inference, and local deployment. It serves as a coordinated suite for language-specific adaptation, including tools for expanding tokenizers and implementing retrieval-augmented generation. The project distinguishes itself through a complete pipeline for model adaptation, featuring multilingual tokenizer expansion and a fine-tuning framework that supports instruction-based supervised training and adapter merging. It also includes a dedicated de
Dream Textures is a Stable Diffusion integration for Blender that provides tools for text-to-image generation, depth projection, and node-based processing within a 3D environment. It functions as an AI texture generator capable of producing image textures and concept art from text prompts and scene renders. The system features a depth-to-image projection tool that maps generated imagery onto 3D models using depth data for spatial alignment. It also includes a node-based AI image processor for creating procedural visual effects and a dedicated toolset for AI-assisted inpainting and outpainting
Intel XPU LLM Acceleration Library is a toolkit designed to accelerate large language model inference and finetuning on Intel CPUs, GPUs, and NPUs. It provides a distributed inference engine for scaling models across multiple accelerators, a multimodal model runtime for vision and speech tasks, and a low-bit model quantization tool for converting weights into INT4, FP8, and GGUF formats. The project features a parameter-efficient finetuning framework that enables model adaptation using QLoRA and DPO on Intel hardware. It distinguishes itself by providing specialized optimizations for Intel XP
This project is an on-device AI SDK providing a framework for running large language models, vision models, and speech models locally. It serves as an orchestration layer for local LLM execution, ensuring data privacy and offline availability by utilizing hardware acceleration on the device. The SDK is distinguished by its comprehensive voice and multimodal capabilities, including a coordinated voice pipeline for activity detection, speech-to-text, and text-to-speech synthesis. It also provides a dedicated implementation kit for local retrieval-augmented generation and tools for processing co
OpenVINO is an AI inference engine and model serving platform designed to execute optimized deep learning models across CPUs, GPUs, and NPUs through a unified API. It includes a model optimization toolkit for converting, quantizing, and compressing models from various frameworks, alongside a specialized generative AI runtime for large language models. The project distinguishes itself through a plugin-based hardware acceleration layer that maps neural network operations to vendor-specific drivers. It features advanced execution mechanisms such as continuous batching, speculative decoding, and
mistral.rs is an inference engine for large language models that runs locally and exposes models behind OpenAI and Anthropic-compatible APIs. It serves as a multi-model serving platform, capable of loading several models in a single server process with per-request routing and on-demand loading and unloading. The engine supports multimodal inference, processing text alongside images, video, audio, and speech inputs, and includes a quantized model deployment runtime that reduces memory use and speeds up inference on consumer hardware. The project distinguishes itself through an agentic tool exe
Megatron-LM is a distributed transformer training library and large language model training framework designed to scale models across thousands of GPUs. It functions as a GPU-optimized deep learning toolkit and a scaling engine for mixture-of-experts architectures, enabling the training of models with hundreds of billions of parameters. The project implements multi-dimensional model parallelism, combining tensor, pipeline, data, expert, and context-based workload distribution. It specifically optimizes mixture-of-experts architectures through integrated memory and communication improvements t
VoxCPM is a multilingual speech synthesis system and text-to-speech inference server. It functions as an AI voice cloning tool and a synthetic voice designer, capable of generating natural speech across global languages and regional dialects using a GPU-accelerated audio generator. The project features a speech model fine-tuning framework that supports both full parameter updates and low-rank adaptation for customizing voice characteristics. It enables high-fidelity voice cloning from reference audio, including cross-lingual voice transfer and acoustic environment mimicry, as well as the crea
This project is a language model finetuning framework designed to adapt large language models to specific datasets using supervised fine-tuning and low-rank adaptation. It serves as a distributed training manager that coordinates workloads and synchronizes gradients across multiple processing units to scale performance. The framework includes a specialized toolkit for low-rank adaptation to update a subset of model weights, reducing memory and hardware requirements. It provides capabilities for instruction fine-tuning, domain adaptation, and the optimization of function calling to improve how
ai-toolkit is a diffusion model training toolkit designed for fine-tuning image and video generation models. It functions as a containerized model trainer and GPU training job manager, providing the infrastructure to orchestrate dependencies and manage training processes on remote GPU hardware. The system utilizes low-rank adaptation techniques, including LoRA and LoKr weight optimization, to reduce the hardware requirements for model training. It distinguishes itself through a web-based training controller that allows for the monitoring and modification of hyperparameters, secured by token-b
This project is a quantized fine-tuning framework for large language models. It implements a low-rank adaptation library and a four-bit quantizer to reduce the GPU memory requirements needed to train large models. The framework utilizes four-bit quantization and low-rank adapters to enable model training on consumer-grade hardware. It further reduces the memory footprint through double quantization and a paged optimizer that offloads states to system RAM. The system supports distributed training across multiple GPUs to handle larger parameter scales and includes utilities for custom dataset
h2o-llmstudio is a language model training framework that provides a no-code graphical interface for fine-tuning large language models on custom datasets. It functions as a specialized tool for managing the training lifecycle, from configuring hyperparameters to monitoring performance metrics. The project distinguishes itself through a multi-GPU training orchestrator that distributes workloads via data parallel processing and a low-rank adaptation tool for memory-efficient fine-tuning. It also includes a model evaluation dashboard featuring an interactive chat interface to verify conversation