Systems for assessing agent decision-making, action success, and conversation quality through automated scoring and feedback loops.
Explore 2 awesome GitHub repositories matching artificial intelligence & ml · Agent Evaluation Frameworks.
This project is a comprehensive educational resource and knowledge base dedicated to the development and application of large language models and autonomous agentic systems. It provides a structured framework for understanding prompt engineering, context management, and the architectural patterns required to build task
Tracks key performance indicators such as task completion rates to measure the reliability of autonomous agent workflows.
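As a rough illustration of the kind of indicator mentioned above, a task completion rate can be computed as the fraction of agent runs that finished successfully. This is a minimal sketch and not code from the listed project; the `TaskRun` record and the `completion_rate` helper are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class TaskRun:
    """Hypothetical record of one agent run (assumed shape, not a real API)."""
    task_id: str
    completed: bool


def completion_rate(runs: Sequence[TaskRun]) -> float:
    """Return the fraction of runs marked completed, or 0.0 for no runs."""
    if not runs:
        return 0.0
    return sum(1 for run in runs if run.completed) / len(runs)


# Four illustrative runs, three of which completed.
runs = [
    TaskRun("a", True),
    TaskRun("b", False),
    TaskRun("c", True),
    TaskRun("d", True),
]
rate = completion_rate(runs)
print(f"completion rate: {rate:.2f}")  # prints "completion rate: 0.75"
```

Returning 0.0 for an empty batch is a design choice made here to avoid a division by zero; a real tracker might prefer to report "no data" instead.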
OpenHands is an autonomous agent framework designed for software engineering workflows. It provides a modular platform for orchestrating AI agents that reason, plan, and execute tasks within isolated, containerized development environments. By integrating with standard version control and development tools, the system
Triggers iterative refinement cycles whenever agent output quality falls below defined success thresholds.
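The threshold-triggered refinement described above can be sketched as a loop that re-runs the agent with feedback until its score clears a bar or a round limit is hit. This is an assumption-laden illustration, not OpenHands code: `refine_until_passing`, the `generate` and `score` callables, the 0.8 threshold, and the round limit are all hypothetical.

```python
from typing import Callable, Optional, Tuple


def refine_until_passing(
    generate: Callable[[Optional[str]], str],
    score: Callable[[str], float],
    threshold: float = 0.8,
    max_rounds: int = 3,
) -> Tuple[str, float, int]:
    """Regenerate output while its score is below threshold.

    Returns the last output, its score, and the number of refinement rounds.
    """
    output = generate(None)  # first attempt, no feedback yet
    quality = score(output)
    rounds = 0
    while quality < threshold and rounds < max_rounds:
        # Feed the shortfall back to the generator as plain-text feedback.
        feedback = f"score {quality:.2f} is below threshold {threshold:.2f}"
        output = generate(feedback)
        quality = score(output)
        rounds += 1
    return output, quality, rounds


# Toy stand-ins: each call yields a longer draft, and longer drafts score higher.
drafts = iter(["draft", "better draft", "final draft with tests"])


def toy_generate(feedback: Optional[str]) -> str:
    return next(drafts)


def toy_score(output: str) -> float:
    return min(len(output) / 20, 1.0)


result = refine_until_passing(toy_generate, toy_score)
print(result)  # prints ('final draft with tests', 1.0, 2)
```

The `max_rounds` cap is included so the loop terminates even when the score never reaches the threshold; the caller can inspect the returned score to tell the two outcomes apart.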