thu-coai/Agent-SafetyBench

Agent SafetyBench

The codebase for our paper, Agent-SafetyBench: Evaluating the Safety of LLM Agents. Agent-SafetyBench is a comprehensive agent safety evaluation benchmark that introduces a diverse array of novel, previously unexplored environments and offers broader and more systematic coverage of…

Features

Evaluation Benchmarks - Evaluates the safety and robustness of LLM-based agents.
