30 open-source projects similar to modelscope/swift, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best Swift alternative.
xtuner is a comprehensive training engine for large language models, offering a toolkit for pre-training, supervised fine-tuning, and the optimization of vision-language multimodal models. It serves as a distributed training accelerator and a specialized framework for scaling Mixture-of-Experts models and aligning model behavior through reinforcement learning from human feedback. The project distinguishes itself through advanced memory and compute optimizations, such as sequence parallelism for ultra-long context windows and interleaved pipeline parallelism to reduce GPU idle time. It provide
Torchtune is a PyTorch-native library for fine-tuning, aligning, and quantizing large language models. It provides a configurable training pipeline orchestrated through YAML recipes, with CLI overrides and component swapping, distributed training via FSDP2, memory optimizations, and parameter-efficient fine-tuning methods like LoRA, DoRA, and QLoRA. The library distinguishes itself through its YAML-driven configuration system that defines all training parameters and instantiates components from config files, with full CLI override capability for any field or component at launch time. It suppo
Firefly is a training framework and inference engine for large language models. It functions as a toolkit for pre-training and fine-tuning various open-weight architectures, providing a system for model alignment and parameter-efficient fine-tuning. The project includes utilities for merging adapter weights back into base models to create standalone files. It also provides a model alignment toolkit to format training data according to specific prompt templates, ensuring conversational consistency across different models. The framework supports distributed model training and preference-based
PaddleNLP is a development library and toolkit for training, fine-tuning, and deploying large and small language models using the PaddlePaddle framework. It provides a comprehensive suite for the entire natural language processing lifecycle, from model development to high-performance inference. The project features a standardized model zoo for loading and managing pre-trained models and tokenizers through a unified interface. It distinguishes itself with a specialized model compression framework that reduces memory footprints via weight precision conversion and lossless size optimization, alo
OpenRLHF is a training framework and alignment library designed for reinforcement learning from human feedback across distributed GPU clusters. It provides tools for aligning large language models and multimodal vision-language models using algorithms such as PPO, GRPO, and DPO. The framework distinguishes itself through a distributed inference engine that overlaps sample rollout with training to increase throughput. It supports scaling to models exceeding 70 billion parameters via parameter sharding and handles long-context sequences through ring-attention sequence parallelism. The project
This project is a fine-tuning framework and training pipeline designed to optimize and adapt large language and vision models. It provides a specialized toolkit for parameter-efficient tuning and supervised learning, serving as both a trainer for multimodal models and a deployment tool for serving fine-tuned models via high-performance inference engines. The framework focuses on reducing memory and compute requirements by updating a small subset of model parameters. It supports a wide range of adaptation strategies, including vision-language model training to align text, image, video, and aud
Axolotl is a distributed training orchestrator and fine-tuning framework for large language models, multimodal systems, and quantized models. It provides a structured environment for specializing pre-trained models through full parameter updates or low-rank adaptation, as well as aligning model outputs with human expectations via preference tuning pipelines and reward modeling. The system distinguishes itself through a configuration-driven pipeline that manages preprocessing and training workflows via a single file for reproducibility. It implements high-throughput optimizations such as multi
LLaMA-Factory is a comprehensive suite for dataset preparation, model fine-tuning, memory optimization, and standardized API deployment. It provides a unified platform for the supervised and reward-based fine-tuning of large language models and vision-language models. The framework includes a specialized toolkit for training vision-language models and a model serving interface that deploys trained models through high-performance APIs. It utilizes precision tuning and quantization techniques to reduce the hardware requirements and memory footprint of large models. The system covers data pipel
This project is a comprehensive technical course study guide and reference for learning the architectures and training methods of Transformers and large language models. It serves as a technical overview for understanding how neural networks process data and how to align model behavior with specific performance goals. The repository provides specialized guides on several key areas of model development. This includes detailed references for transformer architectures, implementation frameworks for retrieval-augmented generation and agentic workflows, and technical guides for model optimization
ERNIE is a development toolkit for training, fine-tuning, and deploying large language models built on the PaddlePaddle deep learning platform. It provides a comprehensive suite of core components, including an inference server for vision and language models, a training and fine-tuning toolkit, and a framework for building retrieval-augmented generation systems using private knowledge bases. The project features multimodal AI models capable of reasoning across text, images, and video to perform complex visual understanding and information extraction. It distinguishes itself through specialize
This project provides a Chinese large language model based on the LLaMA architecture. It is an instruction-tuned model optimized for natural language processing and multi-turn conversations in Chinese. The system includes a framework for parameter-efficient fine-tuning using low-rank adaptation and quantization to reduce memory requirements. It also implements retrieval augmented generation for local document question answering and supports long-context processing for sequences up to 64K tokens. The project covers a broad set of capabilities including supervised instruction tuning, reinforce
Torchtune is a PyTorch-native library for fine-tuning, aligning, and quantizing large language models. It provides a config-driven system for instantiating components, orchestrating distributed training, and managing parameter-efficient fine-tuning with quantization support, all through YAML-based configurations and command-line overrides. The library distinguishes itself through its comprehensive post-training workflow orchestration, combining supervised fine-tuning, preference optimization (DPO, PPO, GRPO), knowledge distillation, and quantization-aware training in a single configurable pip
PocketFlow is an integrated toolkit for deep learning model compression, distributed training, and mobile format optimization. It provides a system for reducing the size and complexity of neural networks to improve inference efficiency, featuring a dedicated engine for knowledge distillation and a mobile model optimizer. The framework differentiates itself through an automated hyperparameter tuning system that uses reinforcement learning and statistical models to determine optimal compression ratios and layer-wise bit allocation. It also includes a distributed training system that utilizes mu
SmolLM is a project dedicated to the development of small language models. It focuses on training and fine-tuning compact models that maintain high performance while utilizing fewer parameters. The project emphasizes efficient AI inference and on-device text generation, aiming to enable the deployment of lightweight models on edge devices with limited memory and processing power. It utilizes synthetic data generation to produce artificial datasets that improve the reasoning and training of these AI systems. The system supports a variety of optimization and training capabilities, including we
Metaseq is a transformer sequence modeling toolkit designed for training, fine-tuning, and deploying sequence-to-sequence models using open pre-trained weights. It provides a comprehensive framework for large language model training, including dedicated tools for sequence dataset processing and a standalone inference server for generating text via API requests. The project features specialized utilities for model quantization to reduce parameter precision to eight bits, which lowers memory usage and increases inference speed. It also includes a checkpoint conversion pipeline to transform mode
LitGPT is a training and deployment framework for large language models, providing a suite of tools for pretraining, finetuning, quantizing, evaluating, and serving models within a production environment. It includes a dedicated training pipeline for adapting pretrained models to specific tasks, a quantization tool for reducing weight precision, and an inference server for hosting models via web interfaces. The framework supports high-performance model development through custom architecture implementation and the use of predefined recipes to standardize pretraining and finetuning. It enables
Composer is a PyTorch distributed training framework designed for scaling large-scale models across multi-node GPU clusters. It functions as a large language model trainer, a distributed model optimizer, and a training lifecycle manager. The project differentiates itself as a deep learning regularization library, providing specialized optimization techniques such as Sharpness Aware Minimization, MixUp, and CutMix to improve model generalization. It further distinguishes its training flow through the use of sequence length warmup, progressive layer freezing, and sharded-state checkpointing for
llm-foundry is a training framework for large language models, providing a system for foundation model pre-training and supervised fine-tuning. It includes a distributed trainer for scaling workloads across multiple nodes and GPUs, a dataset streaming pipeline for loading data from cloud storage, and a parameter-efficient fine-tuning implementation. The framework distinguishes itself through its use of parameter sharding and high-throughput data streaming to maintain stability during large-scale training. It incorporates low-rank adaptation to reduce computational costs and uses eight-bit flo
Unsloth is a high-performance training and inference platform designed to optimize the lifecycle of large language and multimodal models. It provides a comprehensive engine for fine-tuning, executing, and managing models locally, with a focus on reducing memory consumption and increasing compute speed on consumer-grade hardware. The platform distinguishes itself through hand-optimized kernels and automated computational graph techniques that maximize hardware throughput. It supports advanced training methodologies, including reinforcement learning for reasoning and efficient adapter-based fin
This project is a multimodal model trainer and machine learning fine-tuning tool that provides a containerized workflow for adapting pre-trained models to specific tasks. It features a no-code web interface and a dashboard for training large language models and other machine learning datasets without writing code. The system distinguishes itself by integrating a no-code interface with remote GPU orchestration, allowing users to deploy containerized training environments on cloud infrastructure or local hardware. It includes a dedicated integrator for uploading trained model weights and config
BigDL is a PyTorch acceleration framework and distributed inference engine designed for large language models. It provides a toolkit for running models on Intel hardware, integrating quantization tools and libraries for parameter-efficient fine-tuning. The project distinguishes itself through the use of pipeline parallelism to distribute model workloads across multiple hardware accelerators. It utilizes low-bit integer quantization and speculative decoding to reduce memory footprints and decrease text generation latency. The system covers broad capabilities in model optimization, including w
AISystem is a comprehensive AI full-stack infrastructure project covering the entire pipeline from AI chip architecture to high-level training frameworks. It encompasses the development of AI compiler frameworks, inference engines, and distributed training orchestrators designed to coordinate workloads across a heterogeneous compute stack of CPUs, GPUs, and NPUs. The project focuses on the deep integration of software and hardware, employing software-hardware co-design to align tensor layouts with physical memory structures. It provides specialized capabilities for accelerating Transformer mo
This project is a distributed training infrastructure designed for aligning large language models through reinforcement learning. It functions as an end-to-end engine for complex alignment tasks, including proximal policy optimization, direct preference optimization, and iterative self-play. By providing a unified framework for multi-turn interactions and tool-use scenarios, it enables the development of models capable of reasoning and external environment engagement. The framework distinguishes itself through a decoupled architecture that separates model training from sample generation. This
This project is a comprehensive learning resource and set of demonstrations focused on large language model integration, deployment, and fine-tuning. It provides educational content and practical guides for working with artificial intelligence models. The resource includes specific tutorials and courses on adapting pre-trained models to specialized datasets using parameter-efficient fine-tuning techniques. It also provides instructional content for running quantized models on consumer hardware and building retrieval augmented generation pipelines using vector databases and document indexing.
This project is a comprehensive educational curriculum and structured learning path covering the full lifecycle of large language models. It provides a guided progression through the theory, architecture, training, and deployment of these models. The curriculum includes specialized guides on transformer architecture, model training tutorials, and frameworks for designing autonomous agents. It also provides dedicated resources for studying model safety and ethics. The material covers a wide range of technical capabilities, including distributed training strategies, parameter-efficient fine-tu
AutoGluon is an automated machine learning framework and multimodal library designed to automate the end-to-end pipeline from data preprocessing to high-accuracy model training and validation. It functions as an automated model trainer for tabular, image, text, and time series data, as well as a tool for time series forecasting and foundation model finetuning. The project is distinguished by its ability to jointly process and fuse different data types, allowing for the construction of multimodal neural networks that integrate images, text, and structured tables. It supports zero-shot inferenc
This project provides a comprehensive collection of educational resources and technical guides for training, fine-tuning, and deploying machine learning models using PyTorch and Hugging Face. It serves as a practical reference for scaling deep learning workflows, offering structured instructions for managing large-scale architectures across distributed hardware accelerators. The repository distinguishes itself by focusing on the end-to-end lifecycle of large language models, specifically emphasizing containerized deployment and performance optimization. It details workflows for parameter-effi
This project is a comprehensive toolkit for adapting large language models to the Chinese language, providing a specialized framework for fine-tuning, inference, and local deployment. It serves as a coordinated suite for language-specific adaptation, including tools for expanding tokenizers and implementing retrieval-augmented generation. The project distinguishes itself through a complete pipeline for model adaptation, featuring multilingual tokenizer expansion and a fine-tuning framework that supports instruction-based supervised training and adapter merging. It also includes a dedicated de
DeepSpeedExamples is a collection of reference implementations and scripts for training, fine-tuning, and executing inference on large-scale AI models using DeepSpeed optimization. It provides a distributed model training guide and practical workflows for adapting large language models through memory-efficient techniques. The repository includes specialized implementations for pipeline parallelism to handle models exceeding single GPU memory and a suite of examples for ZeRO memory optimization to reduce per-device overhead. It also features standardized test suites for benchmarking the throug