What are the best open-source GitHub repositories for a toolkit for detecting prompt injection attacks?

0x4m4/hexstrike-ai is the closest match — This platform provides a comprehensive suite for LLM security, including dedicated firewalls, prompt injection prevention, and adversarial testing tools designed to protect AI agents and models.. Other strong matches: protectai/llm-guard, nvidia/garak, tencent/ai-infra-guard, nirdiamant/prompt_engineering.

Why does 0x4m4/hexstrike-ai match “a toolkit for detecting prompt injection attacks”?

This platform provides a comprehensive suite for LLM security, including dedicated firewalls, prompt injection prevention, and adversarial testing tools designed to protect AI agents and models.

Why does protectai/llm-guard match “a toolkit for detecting prompt injection attacks”?

LLM Guard is a comprehensive security firewall and guardrail framework that provides a modular pipeline for detecting prompt injections, validating outputs, and securing LLM interactions, making it a flagship solution for this category.

Why does nvidia/garak match “a toolkit for detecting prompt injection attacks”?

Garak is a comprehensive red teaming and vulnerability scanning framework that identifies prompt injection and other behavioral weaknesses in LLMs, serving as a critical tool for adversarial testing and security auditing.

Why does tencent/ai-infra-guard match “a toolkit for detecting prompt injection attacks”?

This platform provides a comprehensive suite for adversarial testing and red teaming to identify prompt injection and jailbreak vulnerabilities in LLM deployments, serving as a robust tool for AI security auditing.

Why does nirdiamant/prompt_engineering match “a toolkit for detecting prompt injection attacks”?

This repository provides a collection of prompt engineering methodologies and security guardrails, including specific primitives for detecting and preventing prompt injection attacks within LLM workflows.

LLM Prompt Injection Defense Tools

Security frameworks and scanning utilities designed to detect and mitigate prompt injection attacks in LLM applications.

Find the best repos with AI.We'll search the best matching repositories with AI.

0x4m4/hexstrike-ai
0x4m4/hexstrike-ai
9,617View on GitHub
This project is a comprehensive security platform providing an LLM security orchestration framework, an AI agent firewall, and tools for vulnerability remediation, compliance automation, and endpoint protection. It functions as a centralized system to protect AI models from adversarial exploits while managing the identification and patching of software flaws. The platform distinguishes itself through the coordination of specialized AI agents to automate complex security workflows, including reconnaissance, bug hunting, and exploit development. It implements dedicated guardrails to block promp
This platform provides a comprehensive suite for LLM security, including dedicated firewalls, prompt injection prevention, and adversarial testing tools designed to protect AI agents and models.
PythonLLM Prompt Injection PreventionAI Red Teaming
View on GitHub9,617
protectai/llm-guard
protectai/llm-guard
2,561View on GitHub
LLM Guard is a security firewall and guardrail framework designed to scan and sanitize inputs and outputs for large language models. It functions as a proxy gateway and security layer to block prompt injections, toxicity, and sensitive data leakage while ensuring that model interactions remain compliant with organizational policies. The system distinguishes itself through a modular scanner pipeline that utilizes local model orchestration to eliminate external network dependencies. It supports real-time security filtering via streaming chunk analysis and implements a fail-fast execution model
LLM Guard is a comprehensive security firewall and guardrail framework that provides a modular pipeline for detecting prompt injections, validating outputs, and securing LLM interactions, making it a flagship solution for this category.
PythonLLM Prompt Injection PreventionLLM Security
View on GitHub2,561
nvidia/garak
NVIDIA/garak
8,114View on GitHub
Garak is an AI model evaluation tool and vulnerability scanner designed for red teaming large language models and auditing the security of retrieval-augmented generation pipelines. It identifies behavioral weaknesses, such as jailbreaks, hallucinations, and data leakage, by simulating adversarial attacks and executing automated testing vectors. The framework utilizes an adaptive probing loop where prompts can react to previous model behavior and be modified in flight via middleware. To ensure consistent analysis, it employs a provider-agnostic interface to interact with various model APIs and
Garak is a comprehensive red teaming and vulnerability scanning framework that identifies prompt injection and other behavioral weaknesses in LLMs, serving as a critical tool for adversarial testing and security auditing.
PythonAdversarial TestingAI Red Teaming
View on GitHub8,114
tencent/ai-infra-guard
Tencent/AI-Infra-Guard
2,971View on GitHub
AI-Infra-Guard is a security scanning platform designed to detect vulnerabilities across large language model deployments, AI agent skills, and the underlying infrastructure. It functions as a security toolset for auditing source code, evaluating model robustness, and identifying insecure network configurations. The project provides a red teaming framework that uses curated attack datasets to test for jailbreak vulnerabilities and prompt injections. It also includes an infrastructure auditor that employs network fingerprinting and asset discovery to match running components against known comm
This platform provides a comprehensive suite for adversarial testing and red teaming to identify prompt injection and jailbreak vulnerabilities in LLM deployments, serving as a robust tool for AI security auditing.
PythonLLM Security
View on GitHub2,971
nirdiamant/prompt_engineering
NirDiamant/Prompt_Engineering
7,159View on GitHub
This project is a comprehensive guide and framework for designing, optimizing, and securing inputs to improve the accuracy and reasoning of large language model outputs. It provides core methodologies for implementing logical reasoning steps, example-based learning, and reusable template systems. The framework distinguishes itself through a focus on security guardrails and ethical auditing, implementing primitives to prevent adversarial prompt injection attacks and identify biases. It also emphasizes structured generation, using persona assignment and negative constraints to control the tone,
This repository provides a collection of prompt engineering methodologies and security guardrails, including specific primitives for detecting and preventing prompt injection attacks within LLM workflows.
Jupyter NotebookLLM Prompt Injection Prevention
View on GitHub7,159
portkey-ai/gateway
Portkey-AI/gateway
12,091View on GitHub
This project is an artificial intelligence gateway that functions as a centralized middleware layer for managing, securing, and observing interactions with language, vision, and audio models. It provides a unified interface that standardizes requests across multiple providers, enabling teams to integrate AI capabilities into their applications through a consistent set of tools and protocols. The gateway distinguishes itself through its comprehensive infrastructure governance and traffic management capabilities. It allows for policy-driven routing, automated failover, and load balancing across
This project serves as an AI gateway that provides essential security guardrails, sensitive data redaction, and centralized traffic management, making it a robust tool for securing LLM interactions against unauthorized inputs and data leakage.
TypeScriptAI GuardrailsModel Safety Filters
View on GitHub12,091
elder-plinius/cl4r1t4s
elder-plinius/CL4R1T4S
40,356View on GitHub
CL4R1T4S is a framework designed to orchestrate generative AI workflows and optimize language model outputs. It functions as a centralized utility for managing, versioning, and deploying structured system prompts and behavioral parameters to ensure consistent performance across complex tasks. The project distinguishes itself by implementing a structured pipeline that wraps model interactions to enforce behavioral constraints and sanitize inputs. This orchestration layer incorporates heuristic-based validation and stateful context management to maintain coherence and quality throughout multi-s
This framework provides an orchestration layer for LLM workflows that includes input sanitization and behavioral constraints, serving as a tool for managing and securing prompt-based interactions.
LLM Prompt Injection Prevention
View on GitHub40,356
promptfoo/promptfoo
promptfoo/promptfoo
10,529View on GitHub
Promptfoo is an evaluation framework designed for testing, benchmarking, and red-teaming language models and agentic workflows. It provides a unified environment to run prompts against multiple providers, allowing developers to systematically validate model outputs against objective assertions, semantic similarity metrics, and custom grading rubrics. The platform distinguishes itself through a provider-agnostic execution layer and a stateful orchestrator capable of simulating multi-turn conversations and complex tool-use trajectories. It includes a dedicated adversarial mutation pipeline that
This framework provides a robust environment for adversarial red-teaming and automated vulnerability scanning to detect prompt injections, though it functions primarily as an evaluation and testing tool rather than a real-time LLM firewall or runtime mitigation service.
TypeScriptPrompt Injection TestingAI Red Teaming
View on GitHub10,529
rightnow-ai/openfang
RightNow-AI/openfang
17,834View on GitHub
OpenFang is an operating system for LLM agents designed to orchestrate autonomous agents with built-in task scheduling, tool sandboxing, and multi-model routing. It provides a secure AI execution environment that integrates prompt injection scanning, cryptographic audit trails, and resource metering to ensure controlled processing. The platform distinguishes itself through a comprehensive security architecture, featuring fuel-metered tool sandboxing and an immutable activity audit trail based on cryptographic hash-chains. It implements high-assurance identity verification via signed manifests
OpenFang is an agent orchestration framework that integrates prompt injection scanning and secure execution sandboxing as core components of its architecture, making it a relevant tool for defending LLM-powered applications.
RustPrompt Injection Testing
View on GitHub17,834
nearai/ironclaw
nearai/ironclaw
12,456View on GitHub
Ironclaw is an LLM orchestration framework and AI agent gateway designed to connect large language models with external tools, messaging interfaces, and persistent memory systems. It functions as a communication layer that routes interactions between users and AI models via HTTP webhooks and various messaging channels. The system focuses on secure tool execution through a WebAssembly sandbox and isolated containers, which allows the framework to run untrusted code and dynamically generate new tools from natural language descriptions. Security middleware provides prompt injection defense and s
Ironclaw is an LLM orchestration framework that includes built-in security middleware for prompt injection defense, output validation, and credential protection, making it a relevant tool for securing LLM-powered applications.
RustLLM Prompt Injection Prevention
View on GitHub12,456
berriai/litellm
BerriAI/litellm
50,579View on GitHub
LiteLLM is a unified gateway and proxy server designed to centralize access to over one hundred language model providers. It provides a standardized API interface that abstracts vendor-specific schemas, allowing developers to interact with diverse models through a single, consistent format. By acting as a central traffic management layer, it enables organizations to route, secure, and govern model interactions across multiple deployments. The platform distinguishes itself through its policy-driven architecture, which uses configuration-based routing to manage traffic distribution, load balanc
LiteLLM acts as a centralized LLM gateway that provides essential security features like content moderation, secret redaction, and access control, making it a robust infrastructure tool for managing and securing LLM interactions.
PythonModel Safety Filters
View on GitHub50,579
nirdiamant/agents-towards-production
NirDiamant/agents-towards-production
17,375View on GitHub
This project is a comprehensive framework for developing, orchestrating, and deploying autonomous agents. It provides a structured environment for building agents that utilize reasoning loops to perform multi-step tasks, manage state through graph-based workflows, and interact with external tools. By mapping unstructured model outputs into typed schemas, the framework ensures reliable integration with downstream application logic. The platform distinguishes itself through a focus on production-grade reliability and security. It incorporates hybrid memory systems that combine vector embeddings
This framework provides built-in guardrails and input/output validation specifically designed to mitigate injection attacks within autonomous agent workflows, making it a relevant tool for securing LLM-powered applications.
Jupyter NotebookPrompt Injection Testing
View on GitHub17,375
litestar-org/litestar
litestar-org/litestar
8,302View on GitHub
Litestar is a high-performance Python ASGI web framework designed for building asynchronous APIs and web services. It functions as a type-safe toolkit that leverages Python type hints to provide automatic request validation and response serialization, while natively generating interactive API documentation based on the OpenAPI specification. The framework is distinguished by its integrated dependency injection system, which manages shared resources and resolves complex nested service chains directly within request handlers. It further organizes API development through class-based controllers
This is a general-purpose web framework for building APIs, not a specialized security tool designed for detecting or mitigating prompt injection attacks in LLM applications.
PythonAPI SecurityAPI Security Management
View on GitHub8,302

LLM Prompt Injection Defense Tools

0x4m4/hexstrike-ai

protectai/llm-guard

NVIDIA/garak

Tencent/AI-Infra-Guard

NirDiamant/Prompt_Engineering

Portkey-AI/gateway

elder-plinius/CL4R1T4S

promptfoo/promptfoo

RightNow-AI/openfang

nearai/ironclaw

BerriAI/litellm

NirDiamant/agents-towards-production

litestar-org/litestar