This repository is a collection of guides, notebooks, and recipes for implementing advanced prompting techniques and workflow patterns with large language models. It serves as a prompt engineering guide, an evaluation suite for scoring prompt quality, and a framework for orchestrating agents and integrating external tools. The project provides implementation patterns for building applications with Claude, focusing on coordinating multiple models so that complex tasks are split between high-reasoning and high-efficiency agents. It includes technical demonstrations for multimodal data proce
RAGChecker: A Fine-grained Framework For Diagnosing RAG
A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models.
Build - Rapid Experiment - Evaluate - Observability