THUDM/AgentBench

3,502 stars · 263 forks · Python · Apache-2.0

Features

Evaluation And Benchmarks - Comprehensive benchmark for evaluating LLM agents across diverse environments.
General Agent Benchmarks - Comprehensive evaluation of LLMs as agents.