WizardLM is a large language model and instruction-tuning framework designed to execute sophisticated coding, mathematical, and conversational tasks. It functions as an AI system for mathematical reasoning and code generation, as well as a synthetic dataset generator used to train other language models. The project is distinguished by its evolutionary instruction tuning, which rewrites simple instructions into more complex tasks. This process raises the difficulty of the training dataset and produces a high volume of open-domain tasks across various difficulty levels. The system covers capa
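The evolutionary rewriting described for WizardLM can be sketched roughly as a loop that repeatedly asks a model to make an instruction harder. This is a minimal illustration, not the project's actual prompts or pipeline: the operation names, the prompt wording, and the `rewrite_fn` callback (a stand-in for a language model call) are all assumptions.

```python
import random

# Hypothetical evolution operations; the wording is illustrative only.
EVOLUTION_OPS = {
    "add_constraint": "Add one more constraint or requirement to the task.",
    "deepen": "Ask for deeper reasoning about the same topic.",
    "concretize": "Replace general concepts with more specific ones.",
}


def build_evolution_prompt(instruction, op):
    """Return a prompt asking a model to rewrite `instruction` using operation `op`."""
    return (
        "Rewrite the following task into a more complex version.\n"
        f"{EVOLUTION_OPS[op]}\n"
        f"Instruction: {instruction}\n"
        "Rewritten version:"
    )


def evolve(instruction, rewrite_fn, rounds=3, seed=0):
    """Apply `rounds` of rewriting and return every version, simplest first.

    `rewrite_fn` is a placeholder for a language model call: it takes a
    prompt string and returns the rewritten instruction.
    """
    rng = random.Random(seed)
    history = [instruction]
    for _ in range(rounds):
        op = rng.choice(sorted(EVOLUTION_OPS))
        prompt = build_evolution_prompt(history[-1], op)
        history.append(rewrite_fn(prompt))
    return history
```

Keeping every intermediate version in `history` is what yields tasks at several difficulty levels from a single seed instruction.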
GLM-4 is a large language model and fine-tuning framework designed for human-like text production, complex reasoning, and multilingual conversation. It functions as a multimodal system capable of processing high-resolution visual content and as a long-context model designed to analyze documents with a context window of up to one million tokens. The project differentiates itself through a function calling interface that enables AI agent development by connecting the model to external APIs and real-time web browsing. It includes specialized capabilities for generating functional programming cod
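As a rough illustration of how a function calling interface connects a model to external code, the sketch below parses a JSON tool call and dispatches it to a registered Python function. The JSON shape, the `TOOLS` registry, and the `get_weather` tool are hypothetical and do not reflect GLM-4's actual message format.

```python
import json


def get_weather(city):
    """Hypothetical tool; a real one would query an external API."""
    return {"city": city, "forecast": "unknown"}


# Registry mapping tool names the model may emit to Python callables.
TOOLS = {"get_weather": get_weather}


def dispatch_tool_call(model_output):
    """Parse a call shaped like {"name": ..., "arguments": {...}} and run it."""
    call = json.loads(model_output)
    name = call["name"]
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**call.get("arguments", {}))
```

In an agent loop, the returned value would typically be serialized and passed back to the model as the next message.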
StableLM is a pre-trained transformer-based large language model designed for natural language generation and zero-shot inference. It functions as a causal language model that predicts the next token in a sequence to produce human-like text for conversational and creative writing tasks. The model is built as a fine-tunable base, allowing the adaptation of pre-trained weights to specific tasks or styles through custom dataset training and weight regularization. It utilizes rotary positional embeddings and flash-attention to optimize memory usage and processing efficiency during deployment on G
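Rotary positional embeddings, mentioned in the StableLM entry, encode position by rotating pairs of vector components through position-dependent angles. The sketch below is a dependency-free illustration that assumes adjacent-pair rotation and a base of 10000; it is not StableLM's actual implementation, which may pair dimensions differently or rotate only part of each head.

```python
import math


def rotary_embed(vec, position, base=10000.0):
    """Rotate consecutive pairs of `vec` by angles that grow with `position`."""
    if len(vec) % 2 != 0:
        raise ValueError("vector length must be even")
    dim = len(vec)
    out = []
    for i in range(0, dim, 2):
        # Each pair gets its own frequency; later pairs rotate more slowly.
        theta = position * base ** (-i / dim)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * cos_t - y * sin_t, x * sin_t + y * cos_t])
    return out


def dot(a, b):
    """Plain dot product, used here as an unnormalized attention score."""
    return sum(x * y for x, y in zip(a, b))
```

Because each pair is a pure rotation, the score `dot(rotary_embed(q, m), rotary_embed(k, n))` depends only on the offset `m - n`, which is the property that lets attention reason about relative positions.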
ChatGLM3 is an open-weights large language model designed for bilingual conversational interactions in English and Chinese. It functions as a tool-augmented system capable of calling external functions and executing internal code to resolve complex tasks. The model utilizes four-bit quantization to reduce memory requirements, enabling inference on consumer hardware and diverse processing units including GPUs and CPUs. It features an expanded context window for processing and summarizing long documents and includes a supervised fine-tuning pipeline for adapting the model to specialized domains
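To show why four-bit quantization reduces memory, the sketch below maps floating-point weights to integer codes in [-7, 7] with one shared scale and packs two codes into each byte. It assumes simple symmetric absmax quantization over a whole list; ChatGLM3's actual scheme is not described here and may differ (for example, per-group scales).

```python
def quantize_4bit(weights):
    """Map floats to integer codes in [-7, 7] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 7 if max_abs > 0 else 1.0
    codes = [max(-7, min(7, round(w / scale))) for w in weights]
    return codes, scale


def pack_nibbles(codes):
    """Store two 4-bit codes per byte; codes are offset by 8 to be unsigned."""
    padded = codes + [0] * (len(codes) % 2)
    return bytes(((a + 8) << 4) | (b + 8) for a, b in zip(padded[::2], padded[1::2]))


def unpack_nibbles(packed, count):
    """Recover the first `count` signed codes from packed bytes."""
    codes = []
    for byte in packed:
        codes.extend([(byte >> 4) - 8, (byte & 0x0F) - 8])
    return codes[:count]


def dequantize_4bit(codes, scale):
    """Approximate the original floats from codes and the shared scale."""
    return [c * scale for c in codes]
```

Each weight then occupies half a byte plus a share of the scale, compared with two or four bytes in 16-bit or 32-bit floating point, at the cost of a rounding error of at most half the scale per weight.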