awesome-repositories.com
© 2026 Bringes Technology SRL·VAT RO45896025·hello@bringes.io
LLM Evaluation Harnesses · Awesome GitHub Repositories

1 repo

Tools for measuring the quality of model outputs using custom metrics and automated judges.

Distinguishing note: Focuses on programmatic evaluation of LLM pipelines, distinct from standard unit testing.

Explore 1 awesome GitHub repository matching Testing & Quality Assurance · LLM Evaluation Harnesses.

Awesome LLM Evaluation Harnesses GitHub Repositories

  • stanfordnlp/dspy

    32,291 · View on GitHub↗

    DSPy is a declarative programming framework designed for building complex language model applications. It treats model interactions as modular, composable programs, allowing developers to define task logic through typed class schemas rather than relying on manually written prompts. By organizing workflows into hierarchical, reusable Python objects, the framework enables the construction of sophisticated AI systems that manage state and execution flow independently. The framework distinguishes itself through an automated optimization engine that iteratively refines prompt instructions and few-

    Measures output quality using custom metrics and model-based judges to ensure consistent behavior across pipelines.

    Python
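
For readers new to the category, the idea of scoring pipeline outputs with custom metrics and judges can be sketched in a few lines of plain Python. This is a minimal illustration and not DSPy's API: `Example`, `exact_match`, `keyword_judge`, `evaluate`, and `toy_pipeline` are hypothetical names, and the "judge" here is a simple keyword check standing in for a call to a grading model.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Example:
    """One evaluation case: an input and the reference answer (hypothetical type)."""
    question: str
    expected: str


# A metric maps (example, pipeline output) to a score in [0, 1].
Metric = Callable[[Example, str], float]


def exact_match(example: Example, output: str) -> float:
    """Custom metric: 1.0 only if the output equals the reference, ignoring case and outer whitespace."""
    return 1.0 if output.strip().lower() == example.expected.strip().lower() else 0.0


def keyword_judge(example: Example, output: str) -> float:
    """Stand-in for a model-based judge: accepts any output that contains the reference.

    A real judge would typically prompt a language model to grade the output.
    """
    return 1.0 if example.expected.lower() in output.lower() else 0.0


def evaluate(pipeline: Callable[[str], str], dataset: Sequence[Example], metric: Metric) -> float:
    """Run the pipeline on every example and return the mean metric score."""
    if not dataset:
        raise ValueError("dataset must not be empty")
    scores = [metric(ex, pipeline(ex.question)) for ex in dataset]
    return sum(scores) / len(scores)


def toy_pipeline(question: str) -> str:
    """Assumed placeholder for an LLM pipeline: a fixed lookup table."""
    answers = {"Capital of France?": "The capital is Paris.", "2 + 2?": "4"}
    return answers.get(question, "unknown")


DATASET = [
    Example("Capital of France?", "Paris"),
    Example("2 + 2?", "4"),
    Example("Largest planet?", "Jupiter"),
]

if __name__ == "__main__":
    print("exact match:", evaluate(toy_pipeline, DATASET, exact_match))
    print("keyword judge:", evaluate(toy_pipeline, DATASET, keyword_judge))
```

The strict metric and the lenient judge disagree on the first example, which is the usual reason harnesses in this category let you swap the scoring function without touching the pipeline.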