The official repository for the paper: SEMA: Simple yet Effective Learning for Multi-Turn Jailbreak Attacks.
Ensuring AI safety is crucial as large language models become increasingly integrated into real-world applications. A key challenge is jailbreak, where adversarial prompts bypass built-in safeguards to elicit harmful disallowed outputs. Inspired by psychological foot-in-the-door principles, we…
Code Implementation of "Pattern Enhanced Multi-Turn Jailbreaking: Exploiting Structural Vulnerabilities in Large Language Models"
💥Derail Yourself: Multi-turn LLM Jailbreak Attack through Self-discovered Clues