30 open-source projects similar to bigscience-workshop/promptsource, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best Promptsource alternative.
Promptify is a suite of tools designed for model evaluation, prompt management, token cost tracking, structured extraction, and unified API gateway access. It provides a standardized interface to manage requests and responses across multiple large language model providers. The project features a prompt management platform for engineering and versioning prompts with structured output validation. It includes a dedicated evaluation framework to measure model performance using precision, recall, and F1 scores against labeled datasets, alongside a token cost tracker to monitor the financial expens
Rivet is a visual LLM workflow designer and AI agent orchestration engine. It serves as a development environment for building retrieval augmented generation pipelines and a TypeScript library for embedding visual AI graphs and prompt logic into JavaScript applications. The system differentiates itself through a node-based editor that maps data flow between language models, vector databases, and external APIs. It provides specialized tools for prompt engineering, including interfaces for iterative prompt refinement and A/B testing to improve model response quality. The platform covers a broa
This repository catalogs the system prompts used by Claude Code, organizing them into browsable categories with token-count estimates for each prompt. It functions as both a prompt library browser and a revision tracker, surfacing the size and complexity of individual prompts to support auditing and prompt engineering decisions. The project records prompt revisions by parsing git diffs between versions, capturing additions, removals, and token-count changes in a structured changelog. Token counts are approximated from character length using a fixed heuristic ratio, avoiding the need for API c
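The character-length heuristic mentioned in this entry can be sketched in a few lines of Python. This is a minimal illustration, not the repository's code: the ratio of 4 characters per token is an assumed rule of thumb (the listing does not state the actual ratio), and `estimate_tokens` and `token_delta` are hypothetical names.

```python
import math

# Assumption: about 4 characters per token. The ratio the repository
# actually uses is not stated in this listing.
CHARS_PER_TOKEN = 4.0


def estimate_tokens(text: str, chars_per_token: float = CHARS_PER_TOKEN) -> int:
    """Approximate a token count from character length, with no API call."""
    if not text:
        return 0
    return math.ceil(len(text) / chars_per_token)


def token_delta(old: str, new: str) -> int:
    """Estimated token-count change between two revisions of a prompt."""
    return estimate_tokens(new) - estimate_tokens(old)
```

A fixed ratio trades accuracy for speed and offline use: the estimate drifts for code-heavy or non-English prompts, but it is cheap enough to recompute for every revision.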
LMQL is a programming language and probabilistic interface that blends algorithmic logic with stochastic text generation. It functions as a constraint-guided prompting framework and structured output generator, allowing users to force model responses to adhere to strict formatting and data types. The system distinguishes itself as an inference optimizer that increases token throughput and reduces latency. This is achieved through specialized execution strategies, including tree-based prompt caching and asynchronous batch processing. The project covers a broad range of generation control capa
Metaseq is a transformer sequence modeling toolkit designed for training, fine-tuning, and deploying sequence-to-sequence models using open pre-trained weights. It provides a comprehensive framework for large language model training, including dedicated tools for sequence dataset processing and a standalone inference server for generating text via API requests. The project features specialized utilities for model quantization to reduce parameter precision to eight bits, which lowers memory usage and increases inference speed. It also includes a checkpoint conversion pipeline to transform mode
This is a machine learning framework for treating diverse natural language processing tasks as a unified text-to-text problem. It provides a toolkit for pre-training and fine-tuning large-scale transformer models, utilizing a system where both inputs and outputs are formatted as raw text sequences. The framework is distinguished by its distributed training system, which uses mesh-based strategies to scale model weights and training batches across multiple TPU cores. It supports multi-task learning by combining diverse datasets into a single training stream using configurable mixture rates, al
An open-source visual programming environment for battle-testing prompts to LLMs.
OpenChat is a framework for the training, fine-tuning, and deployment of large language models optimized for conversational and mathematical reasoning tasks. It provides a comprehensive lifecycle for these models, ranging from training pipelines and deployment stacks to a web-based chat interface. The project focuses on enabling high-performance model execution on consumer-grade hardware without the need for enterprise-grade accelerators. It includes a production-ready inference server that implements the OpenAI chat completion protocol and utilizes dynamic request batching to optimize hardwa
🖤 Create and share beautiful images of your prompts
Semantic Versioning for Prompts: a clear, predictable version management specification for AI prompts
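As a rough illustration of what semantic versioning applied to a prompt might look like, the sketch below parses and bumps a `MAJOR.MINOR.PATCH` string. It follows general Semantic Versioning conventions only; the `PromptVersion` class and the choice of which prompt changes count as major, minor, or patch are assumptions, not rules taken from this project's specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class PromptVersion:
    """Hypothetical MAJOR.MINOR.PATCH version attached to a prompt."""

    major: int
    minor: int
    patch: int

    @classmethod
    def parse(cls, text: str) -> "PromptVersion":
        major, minor, patch = (int(part) for part in text.split("."))
        return cls(major, minor, patch)

    def bump(self, change: str) -> "PromptVersion":
        # Assumed mapping: "major" for changes that break callers,
        # "minor" for compatible additions, "patch" for small fixes.
        if change == "major":
            return PromptVersion(self.major + 1, 0, 0)
        if change == "minor":
            return PromptVersion(self.major, self.minor + 1, 0)
        if change == "patch":
            return PromptVersion(self.major, self.minor, self.patch + 1)
        raise ValueError(f"unknown change type: {change!r}")

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}.{self.patch}"
```

For example, `PromptVersion.parse("1.4.2").bump("minor")` yields version `1.5.0`, and versions compare in the usual order because the dataclass is declared with `order=True`.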
Towards Robust Grounded Language Modeling [DEMO](https://huggingface.co/spaces/luohy/SAIL-7B) | [WEB](https://openlsr.org/sail-7b)
A library for helping developers craft prompts for Large Language Models
New release: we released adversarial training for both LM pre-training/fine-tuning and f-divergence.
Official Code Repository for the paper Knowledge-Augmented Reasoning Distillation for Small Language Models in Knowledge-intensive Tasks (NeurIPS 2023).
WizardLM is a large language model and instruction-tuning framework designed to execute sophisticated coding, mathematical, and conversational tasks. It functions as an AI system for mathematical reasoning and code generation, as well as a synthetic dataset generator used to train other language models. The project is distinguished by its evolutionary instruction tuning, which uses a method to rewrite simple instructions into complex tasks. This process expands training dataset difficulty and produces a high volume of open-domain tasks across various difficulty levels. The system covers capa
This repo introduces ExpertLLaMA, a solution that produces high-quality, elaborate, expert-like responses by augmenting vanilla instructions with a specialized expert identity description. This repo contains a brief introduction to the method and 52k Instruction-Following Expert Data generated by…
This repository contains the Unnatural Instructions dataset, a set of instructions automatically generated by a large language model. See full details in the paper: "Unnatural Instructions: Tuning Language Models with (Almost) No Human Labor"
Promptfoo is an evaluation framework designed for testing, benchmarking, and red-teaming language models and agentic workflows. It provides a unified environment to run prompts against multiple providers, allowing developers to systematically validate model outputs against objective assertions, semantic similarity metrics, and custom grading rubrics. The platform distinguishes itself through a provider-agnostic execution layer and a stateful orchestrator capable of simulating multi-turn conversations and complex tool-use trajectories. It includes a dedicated adversarial mutation pipeline that
This repository contains the source code for reproducing the data curation of MUFFIN (Multi-faceted Instructions).
This is the repo for the Code Alpaca project, which aims to build and share an instruction-following LLaMA model for code generation. This repo is fully based on Stanford Alpaca and changes only the data used for training. The training approach is the same.
DSPy is a declarative programming framework designed for building complex language model applications. It treats model interactions as modular, composable programs, allowing developers to define task logic through typed class schemas rather than relying on manually written prompts. By organizing workflows into hierarchical, reusable Python objects, the framework enables the construction of sophisticated AI systems that manage state and execution flow independently. The framework distinguishes itself through an automated optimization engine that iteratively refines prompt instructions and few-
OpenPrompt is a prompt learning framework designed to adapt large language models to downstream natural language processing tasks. It provides a comprehensive toolkit for implementing manual, soft, and continuous prompting strategies, allowing models to be refined without updating all underlying parameters. The project is distinguished by its support for both discrete and continuous prompt tuning. It includes a system for injecting trainable soft tokens and embeddings into model inputs via gradient descent, as well as an automatic prompt generation engine that uses beam search and generative
UltraChat is a collection of large-scale conversational datasets and instruction-tuning data designed for training and evaluating generative AI models. It provides structured JSON data consisting of complex, multi-round dialogue sequences intended to refine the performance of large language models in chat tasks. The project focuses on improving reasoning and response quality through a diverse set of interactions across multiple sectors. These datasets are used for supervised fine-tuning and instruction tuning workflows to improve how models follow complex directions and maintain context acros