awesome-repositories.com
© 2026 Bringes Technology SRL · VAT RO45896025 · hello@bringes.io
Agent Evaluation Frameworks · Awesome GitHub Repositories

Systems for assessing agent decision-making, action success, and conversation quality through automated scoring and feedback loops.

Explore 2 awesome GitHub repositories matching Artificial Intelligence & ML · Agent Evaluation Frameworks.

Awesome Agent Evaluation Frameworks GitHub Repositories

  • dair-ai/Prompt-Engineering-Guide

    70,526 · GitHub

    This project is a comprehensive educational resource and knowledge base dedicated to the development and application of large language models and autonomous agentic systems. It provides a structured framework for understanding prompt engineering, context management, and the architectural patterns required to build task

    Tracks key performance indicators such as task completion rates to measure the reliability of autonomous agent workflows.

    MDX · agent · agents · ai-agents
  • OpenHands/OpenHands

    67,974 · GitHub

    OpenHands is an autonomous agent framework designed for software engineering workflows. It provides a modular platform for orchestrating AI agents that reason, plan, and execute tasks within isolated, containerized development environments. By integrating with standard version control and development tools, the system

    Triggers iterative refinement cycles whenever agent output quality falls below defined success thresholds.

    Python · agent · artificial-intelligence · chatgpt

Explore sub-tags

  • Agent Performance Metrics: Quantitative measures for evaluating the effectiveness and reliability of AI agent workflows.
  • Agent Refinement Workflows: Mechanisms for agents to iteratively review and improve their own output based on quality metrics.
  • Agent Task Refinement: Mechanisms for automatically iterating on agent prompts or outputs based on performance feedback loops until success criteria are achieved.
  • Context Validation Frameworks: Methods and tools for evaluating the completeness and accuracy of context provided to AI agents.