Setting up
The main feature of rapidresponsebench/rapidresponsebench is its defense strategies.
Open-source alternatives to rapidresponsebench/rapidresponsebench include:
arobey1/smooth-llm: This is the official source code for "SmoothLLM: Defending LLMs Against Jailbreaking Attacks" by Alex Robey, Eric…
chuhac/reasoning-to-defend: Code for paper.
crystaleye42/eval-safety: This is a repository for replicating the experiments from our paper: Pruning for Protection: Increasing Jailbreak…
damo-nlp-sg/multilingual-safety-for-llms: 📄 Paper • 🤗 Dataset.
devoallen/indust: We have reorganized INDust, aligning evidence with three types of inductive instructions and implementing stricter…
aounon/certified-llm-safety: This repository contains code for the paper Certifying LLM Safety against Adversarial Prompting.
This is the official source code for "SmoothLLM: Defending LLMs Against Jailbreaking Attacks" by Alex Robey, Eric Wong, Hamed Hassani, and George J. Pappas.
This is a repository for replicating the experiments from our paper "Pruning for Protection: Increasing Jailbreak Resistance in Aligned LLMs Without Fine-Tuning".