Qwen-7B is a pretrained causal language model designed for natural language generation, text processing, and complex reasoning tasks. It is available as an instruction-tuned model optimized for conversational interactions and a tool-use model capable of executing function calls and interacting with external APIs. The project provides a quantized version of the model to reduce GPU memory usage and supports the development of autonomous agents that can execute code and perform functions to complete complex goals. The system covers a wide range of capabilities including model fine-tuning throug
This project provides a foundational framework and reference implementation for executing causal language modeling and multimodal reasoning on local systems. It includes a set of core components for managing model assets, a fine-tuning framework, and structural definitions required to instantiate transformer-based architectures. The system is distinguished by its ability to process combined text and image inputs through multimodal transformer models for visual reasoning and document analysis. It also supports the deployment of quantized models, reducing memory footprints through low-precision
Huatuo-Llama-Med-Chinese is a medical large language model specialized in processing and generating natural language text in Chinese. It is an instruction-tuned system designed to answer professional healthcare questions by leveraging a dedicated medical knowledge base. The model integrates structured medical literature and knowledge graphs to ensure clinical accuracy during response generation. It employs knowledge-graph augmented inference to combine structured entity relationships with neural network outputs. The system is developed through domain-specific weight adaptation, cross-lingual
Dolly is an instruction-tuned large language model designed to follow complex natural language directions. It operates as a causal language model that predicts the next token in a sequence to generate coherent conversational responses and perform tasks such as brainstorming, classification, and question answering. The project focuses on the development of models using open datasets suitable for commercial application. It enables the creation of instruction-following models by utilizing curated collections of human-generated instruction-response pairs. The repository provides capabilities for
StarCoder is a large language model, with an associated framework, for generating, completing, and evaluating source code across multiple programming languages. Given a prompt, it can produce complete function implementations or continue a partially written line of code.
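The prompt-based completion described above can be illustrated with a deliberately tiny sketch. This is not StarCoder's architecture or API: a character-level frequency table stands in for the neural network, and the names `train_char_model` and `complete` are hypothetical helpers invented for this example. It only shows the general loop a code model follows, which is to look at the text so far, pick a likely continuation, append it, and repeat.

```python
from collections import Counter, defaultdict


def train_char_model(corpus: str, order: int = 3) -> dict:
    """Count which character follows each `order`-character context.

    Toy stand-in for a trained model (assumption: illustration only).
    """
    counts = defaultdict(Counter)
    for i in range(len(corpus) - order):
        context = corpus[i:i + order]
        counts[context][corpus[i + order]] += 1
    return counts


def complete(model: dict, prompt: str, order: int = 3, max_new_chars: int = 40) -> str:
    """Greedily extend `prompt` one character at a time.

    Stops at a newline, at an unseen context, or after `max_new_chars`.
    """
    text = prompt
    for _ in range(max_new_chars):
        context = text[-order:]
        if context not in model:
            break  # the toy model has never seen this context
        next_char = model[context].most_common(1)[0][0]
        if next_char == "\n":
            break  # treat end of line as end of completion
        text += next_char
    return text


# A one-function "training set" and a partially written line as the prompt.
corpus = "def add(a, b):\n    return a + b\n"
model = train_char_model(corpus, order=3)
completed = complete(model, "    ret", order=3)
print(completed)
```

With this corpus the partial line `    ret` is extended to `    return a + b`. A real code model replaces the frequency table with a transformer over tokens and usually samples rather than always taking the most frequent continuation, but the append-and-repeat loop has the same shape.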
The main features of bigcode-project/starcoder are: Code Generators, Generative Code Assistants, Conversational Coding Assistants, Generative Code Models, Model Adaptation Workflows, Large Language Model Fine-Tuning, LLM Fine-Tuning Toolsets, Parameter Efficient Fine-Tuning.
Open-source alternatives to bigcode-project/starcoder include:
qwenlm/qwen-7b: Qwen-7B is a pretrained causal language model designed for natural language generation, text processing, and complex…
meta-llama/llama-models: This project provides a foundational framework and reference implementation for executing causal language modeling and…
scir-hi/huatuo-llama-med-chinese: Huatuo-Llama-Med-Chinese is a medical large language model specialized in processing and generating natural language…
databrickslabs/dolly: Dolly is an instruction-tuned large language model designed to follow complex natural language directions. It operates…
huggingface/smollm: SmolLM is a project dedicated to the development of small language models. It focuses on training and fine-tuning…
meta-pytorch/torchtune: Torchtune is a PyTorch-native library for fine-tuning, aligning, and quantizing large language models. It provides a…