30 open-source projects similar to bigcode-project/starcoder, ranked by how many features they have in common. Compare stars, activity, and what each one does to find the best StarCoder alternative.
Qwen-7B is a pretrained causal language model designed for natural language generation, text processing, and complex reasoning tasks. It is available as an instruction-tuned model optimized for conversational interactions and a tool-use model capable of executing function calls and interacting with external APIs. The project provides a quantized version of the model to reduce GPU memory usage and supports the development of autonomous agents that can execute code and perform functions to complete complex goals. The system covers a wide range of capabilities including model fine-tuning throug
This project provides a foundational framework and reference implementation for executing causal language modeling and multimodal reasoning on local systems. It includes a set of core components for managing model assets, a fine-tuning framework, and structural definitions required to instantiate transformer-based architectures. The system is distinguished by its ability to process combined text and image inputs through multimodal transformer models for visual reasoning and document analysis. It also supports the deployment of quantized models, reducing memory footprints through low-precision
Huatuo-Llama-Med-Chinese is a medical large language model specialized in processing and generating natural language text in Chinese. It is an instruction-tuned system designed to answer professional healthcare questions by leveraging a dedicated medical knowledge base. The model integrates structured medical literature and knowledge graphs to ensure clinical accuracy during response generation. It employs knowledge-graph augmented inference to combine structured entity relationships with neural network outputs. The system is developed through domain-specific weight adaptation, cross-lingual
Dolly is an instruction-tuned large language model designed to follow complex natural language directions. It operates as a causal language model that predicts the next token in a sequence to generate coherent conversational responses and perform tasks such as brainstorming, classification, and question answering. The project focuses on the development of models using open datasets suitable for commercial application. It enables the creation of instruction-following models by utilizing curated collections of human-generated instruction-response pairs. The repository provides capabilities for
SmolLM is a project dedicated to the development of small language models. It focuses on training and fine-tuning compact models that maintain high performance while utilizing fewer parameters. The project emphasizes efficient AI inference and on-device text generation, aiming to enable the deployment of lightweight models on edge devices with limited memory and processing power. It utilizes synthetic data generation to produce artificial datasets that improve the reasoning and training of these AI systems. The system supports a variety of optimization and training capabilities, including we
Torchtune is a PyTorch-native library for fine-tuning, aligning, and quantizing large language models. It provides a config-driven system for instantiating components, orchestrating distributed training, and managing parameter-efficient fine-tuning with quantization support, all through YAML-based configurations and command-line overrides. The library distinguishes itself through its comprehensive post-training workflow orchestration, combining supervised fine-tuning, preference optimization (DPO, PPO, GRPO), knowledge distillation, and quantization-aware training in a single configurable pip
LLaMA-Adapter is a parameter-efficient fine-tuning framework designed to adapt large language models using a minimal set of trainable parameters. It functions as an instruction tuning tool and a multimodal adapter, allowing pre-trained models to follow human instructions and process non-textual data. The project specializes in the integration of image, video, audio, and sensor data into language models for cross-modal understanding. It enables the customization of LLaMA models through the use of lightweight adapters, which allows for the extraction and storage of learned weights independently
ERNIE is a development toolkit for training, fine-tuning, and deploying large language models built on the PaddlePaddle deep learning platform. It provides a comprehensive suite of core components, including an inference server for vision and language models, a training and fine-tuning toolkit, and a framework for building retrieval-augmented generation systems using private knowledge bases. The project features multimodal AI models capable of reasoning across text, images, and video to perform complex visual understanding and information extraction. It distinguishes itself through specialize
This project is a comprehensive toolkit for adapting large language models to the Chinese language, providing a specialized framework for fine-tuning, inference, and local deployment. It serves as a coordinated suite for language-specific adaptation, including tools for expanding tokenizers and implementing retrieval-augmented generation. The project distinguishes itself through a complete pipeline for model adaptation, featuring multilingual tokenizer expansion and a fine-tuning framework that supports instruction-based supervised training and adapter merging. It also includes a dedicated de
nlp-recipes is a collection of implementation guides and reference templates for applying natural language processing techniques to real-world tasks. It provides standardized workflows and code examples for developing NLP pipelines, from dataset preparation and model training to performance evaluation. The project focuses on the practical application of transformer-based models, offering patterns for fine-tuning pretrained architectures for tasks such as text classification, named entity recognition, and question answering. It also includes a toolkit for model interpretability, allowing users
Chinese-Vicuna is a Chinese large language model and instruction-following AI based on the LLaMA architecture. It is specifically designed for natural language understanding and generation in the Chinese language, utilizing an instruction-tuned model to follow complex user prompts across conversations. The project provides a LoRA fine-tuning framework and quantization systems to enable model adaptation and inference on consumer hardware. It implements quantized inference to reduce memory usage on both CPUs and GPUs, supported by a low-level C++ implementation to minimize system resource requi
This project provides a Chinese large language model based on the LLaMA architecture. It is an instruction-tuned model optimized for natural language processing and multi-turn conversations in Chinese. The system includes a framework for parameter-efficient fine-tuning using low-rank adaptation and quantization to reduce memory requirements. It also implements retrieval-augmented generation for local document question answering and supports long-context processing for sequences up to 64K tokens. The project covers a broad set of capabilities including supervised instruction tuning, reinforce
UltraChat is a collection of large-scale conversational datasets and instruction-tuning data designed for training and evaluating generative AI models. It provides structured JSON data consisting of complex, multi-round dialogue sequences intended to refine the performance of large language models in chat tasks. The project focuses on improving reasoning and response quality through a diverse set of interactions across multiple sectors. These datasets are used for supervised fine-tuning and instruction tuning workflows to improve how models follow complex directions and maintain context acros
CodeLlama is a family of large language models derived from the Llama 2 architecture and specialized for producing, completing, and refactoring source code across multiple programming languages. It functions as a code generation model capable of synthesizing source code from natural language descriptions. The project includes specific model variants designed for different programming tasks. This includes instruction-tuned models trained to follow complex natural language directions and code infilling models that predict and insert missing code segments into existing files by analyzing surroun
OpenRLHF is a training framework and alignment library designed for reinforcement learning from human feedback across distributed GPU clusters. It provides tools for aligning large language models and multimodal vision-language models using algorithms such as PPO, GRPO, and DPO. The framework distinguishes itself through a distributed inference engine that overlaps sample rollout with training to increase throughput. It supports scaling to models exceeding 70 billion parameters via parameter sharding and handles long-context sequences through ring-attention sequence parallelism. The project
Firefly is a training framework and inference engine for large language models. It functions as a toolkit for pre-training and fine-tuning various open-weight architectures, providing a system for model alignment and parameter-efficient fine-tuning. The project includes utilities for merging adapter weights back into base models to create standalone files. It also provides a model alignment toolkit to format training data according to specific prompt templates, ensuring conversational consistency across different models. The framework supports distributed model training and preference-based
MiniCPM is a collection of small language models designed for local, on-device deployment in resource-constrained environments. The project focuses on running dense Transformer models on consumer hardware, including GPUs, CPUs, and Apple Silicon, without requiring custom code forks. The project distinguishes itself through heavy optimization for edge hardware, utilizing quantized weight compression in GGUF and MLX formats to reduce memory overhead. It implements advanced inference techniques such as speculative sampling and radix-tree prefix caching to accelerate generation speed and throughp
ESPnet is a comprehensive speech processing toolkit and PyTorch-based trainer designed for building end-to-end speech recognition, synthesis, and translation models. It provides a structured framework for developing automatic speech recognition systems using transducer and encoder-decoder architectures, alongside engines for text-to-speech synthesis and speech translation pipelines. The project distinguishes itself through a recipe-based workflow execution system that ensures experimental reproducibility by running standardized sequences of scripts for data preparation and model training. It
LMFlow is a comprehensive suite for large language model fine-tuning, context extension, multimodal processing, and inference execution. It provides a toolkit for updating model parameters through full tuning or memory-efficient adapter algorithms, alongside an inference engine for executing tuned models via command-line or web-based interfaces. The framework includes a dedicated alignment suite for supervised tuning and reward model training to refine model behavior. It features a context window extender to increase maximum input lengths and a multimodal framework for building chatbots that
LaWGPT is an open-source large language model fine-tuned specifically for the Chinese legal domain. It is designed to understand and respond to Chinese legal queries, generate legally accurate text, and assist with judicial exam preparation by answering questions from the Chinese National Judicial Examination. The project distinguishes itself through a comprehensive adaptation pipeline that extends a base Chinese-language LLaMA model with legal-domain vocabulary and fine-tunes it on large-scale Chinese legal corpora and curated question-answer pairs. It employs parameter-efficient tuning tech
Intel XPU LLM Acceleration Library is a toolkit designed to accelerate large language model inference and fine-tuning on Intel CPUs, GPUs, and NPUs. It provides a distributed inference engine for scaling models across multiple accelerators, a multimodal model runtime for vision and speech tasks, and a low-bit model quantization tool for converting weights into INT4, FP8, and GGUF formats. The project features a parameter-efficient fine-tuning framework that enables model adaptation using QLoRA and DPO on Intel hardware. It distinguishes itself by providing specialized optimizations for Intel XP
This repository is a collection of frameworks and guides for Llama models, functioning as a fine-tuning framework, an inference pipeline, and an AI workflow orchestrator. It provides tools for adapting large language models to specific datasets and domains. The project includes a parameter-efficient fine-tuning toolkit that utilizes techniques like low-rank adaptation to reduce memory and compute requirements. It also serves as an implementation guide for retrieval-augmented generation, combining model inference with external data retrieval to improve response accuracy. The capability surfac
Swift is a toolkit for the full-parameter and parameter-efficient fine-tuning of large language and multimodal models. It functions as a multimodal model trainer for text, image, video, and audio data, and includes specialized tools for model compression and reinforcement learning from human feedback. The framework provides an alignment toolkit for optimizing model behavior using preference learning algorithms and reinforcement learning. It integrates parameter-efficient fine-tuning methods to adapt models with minimal memory and compute requirements, alongside utilities for reducing hardware
DeepSeek-Coder is a large language model and foundational neural network architecture designed specifically for software development tasks. It functions as an artificial intelligence assistant capable of interpreting complex programming instructions to generate, transpile, and structure source code. The system distinguishes itself through its ability to perform project-level code generation, analyzing broader context and patterns across entire software projects rather than isolated files. It supports multimodal input processing, allowing for the integration of text and visual data to inform i
Oumi is a comprehensive large language model development platform designed for synthesizing data, fine-tuning models, and running performance evaluations. It serves as a unified environment for the entire model lifecycle, encompassing a training and fine-tuning suite, an evaluation framework, and tools for synthetic data generation and model distillation. The platform is distinguished by its iterative, failure-driven synthesis approach, which analyzes model weaknesses during evaluation to generate targeted training data. It utilizes an LLM-based judge framework to programmatically score respo
Tinker Cookbook is an open-source framework for fine-tuning large language models, supporting supervised learning, reinforcement learning, and parameter-efficient techniques like LoRA adapters. It provides a complete pipeline for aligning models with human preferences through multi-stage RLHF workflows, from supervised fine-tuning through preference optimization to reinforcement learning. The framework distinguishes itself through recipe-based training orchestration, where fine-tuning workflows are defined as composable recipe files that chain data loading, model configuration, and training l
This project provides a comprehensive collection of educational resources and technical guides for training, fine-tuning, and deploying machine learning models using PyTorch and Hugging Face. It serves as a practical reference for scaling deep learning workflows, offering structured instructions for managing large-scale architectures across distributed hardware accelerators. The repository distinguishes itself by focusing on the end-to-end lifecycle of large language models, specifically emphasizing containerized deployment and performance optimization. It details workflows for parameter-effi
BigDL is a PyTorch acceleration framework and distributed inference engine designed for large language models. It provides a toolkit for running models on Intel hardware, integrating quantization tools and libraries for parameter-efficient fine-tuning. The project distinguishes itself through the use of pipeline parallelism to distribute model workloads across multiple hardware accelerators. It utilizes low-bit integer quantization and speculative decoding to reduce memory footprints and decrease text generation latency. The system covers broad capabilities in model optimization, including w
This is a collection of Jupyter notebooks that serve as educational guides for training, fine-tuning, and deploying machine learning models within the Hugging Face ecosystem. The notebooks cover the full lifecycle of model development, from loading and configuring pre-trained transformers to packaging trained models for real-time inference via scalable endpoints. The notebooks demonstrate a range of capabilities including diffusion model training and fine-tuning for image generation and editing, transformer model adaptation for natural language processing tasks, and parameter-efficient fine-t
This project is an educational course and learning curriculum for implementing and fine-tuning transformer models using the Hugging Face ecosystem. It serves as a structured guide and technical walkthrough for processing multimodal data, adapting pre-trained neural networks, and deploying models. The material includes a guide for managing, versioning, and distributing model weights and datasets through a centralized asset hub. It also provides a practical tutorial on adapting models to specific datasets using parameter-efficient methods and an implementation guide for solving natural language