# minimaxir/big-list-of-naughty-strings

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/minimaxir-big-list-of-naughty-strings).**

47,578 stars · 2,163 forks · Python · MIT

## Links

- GitHub: https://github.com/minimaxir/big-list-of-naughty-strings
- awesome-repositories: https://awesome-repositories.com/repository/minimaxir-big-list-of-naughty-strings.md

## Description

This project is a curated collection of edge-case strings for finding input validation errors and security vulnerabilities in software. It gathers malicious and malformed character sequences that can trigger unexpected behavior in text processing, database queries, and data parsing.

The repository works as a language-neutral benchmark for input validation and security fuzzing, so developers can stress-test sanitization routines on any platform. Because the primary corpus is a plain text file, it can be read from most programming languages and added to an automated testing pipeline without special tooling.
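As a rough illustration of that kind of integration, the sketch below parses a corpus held in a string and records which entries make a function under test raise. The sample text, the `#` comment convention, and the names `parse_corpus`, `collect_exceptions`, and `reject_angle_brackets` are assumptions made for this example, not part of the project; check the downloaded file's actual layout before relying on it.

```python
from typing import Callable, List, Tuple

# Hypothetical excerpt written in the style of a plain-text corpus.
# Assumption: comment lines start with "#" and blank lines are separators.
SAMPLE_CORPUS = (
    "# Reserved Strings\n"
    "\n"
    "undefined\n"
    "null\n"
    "\n"
    "# Script Injection\n"
    "\n"
    "<script>alert(0)</script>\n"
)


def parse_corpus(text: str) -> List[str]:
    """Return the test strings, skipping blank lines and '#' comments."""
    # split("\n") is used instead of splitlines() because splitlines()
    # also breaks on characters such as U+2028, which could appear
    # inside a test string. This assumes LF line endings.
    return [
        line
        for line in text.split("\n")
        if line and not line.startswith("#")
    ]


def collect_exceptions(
    check: Callable[[str], None], strings: List[str]
) -> List[Tuple[str, Exception]]:
    """Call check on every string and record the ones that raise."""
    failures = []
    for value in strings:
        try:
            check(value)
        except Exception as error:  # broad on purpose: any crash is a finding
            failures.append((value, error))
    return failures


def reject_angle_brackets(value: str) -> None:
    """Hypothetical function under test: raises on '<' or '>'."""
    if "<" in value or ">" in value:
        raise ValueError("angle brackets are not allowed")


if __name__ == "__main__":
    strings = parse_corpus(SAMPLE_CORPUS)
    for value, error in collect_exceptions(reject_angle_brackets, strings):
        print(f"{value!r}: {error}")
```

In a real pipeline, `SAMPLE_CORPUS` would be replaced by the contents of the downloaded text file, read with an explicit UTF-8 encoding, and `reject_angle_brackets` by the routine being tested.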

The dataset supports security auditing and robustness testing by giving teams a shared reference for finding injection flaws and data handling errors. To make parsing consistent across programming environments, the collection is available in multiple formats, including plain text, JSON, and Base64.
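The JSON and Base64 variants can sidestep line-splitting and escaping problems when loading the strings. The sketch below shows one way to decode each, assuming the JSON form is a flat array of strings and the Base64 form holds one encoded entry per line with optional `#` comment lines. Both layouts, and the helper names `from_json` and `from_base64_lines`, are assumptions for illustration rather than documented guarantees.

```python
import base64
import json
from typing import List


def from_json(document: str) -> List[str]:
    """Parse a JSON document assumed to be a flat array of strings."""
    values = json.loads(document)
    if not isinstance(values, list) or not all(
        isinstance(value, str) for value in values
    ):
        raise ValueError("expected a JSON array of strings")
    return values


def from_base64_lines(text: str) -> List[str]:
    """Decode a file assumed to hold one Base64 entry per line."""
    decoded = []
    for line in text.split("\n"):
        line = line.strip()
        # Assumption: blank lines and '#' comment lines carry no test data.
        if not line or line.startswith("#"):
            continue
        # Entries are assumed to be UTF-8 text; an entry that is not will
        # raise UnicodeDecodeError here instead of being silently altered.
        decoded.append(base64.b64decode(line).decode("utf-8"))
    return decoded


if __name__ == "__main__":
    sample = ["undefined", "', DROP TABLE users; --", "田中さん"]
    print(from_json(json.dumps(sample)))
    encoded = "\n".join(
        base64.b64encode(value.encode("utf-8")).decode("ascii")
        for value in sample
    )
    print(from_base64_lines(encoded))
```

Base64 is mainly useful when the strings must pass through tooling that would otherwise mangle control characters or unusual Unicode before they reach the code under test.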

## Tags

### Testing & Quality Assurance

- [Naughty String Lists](https://awesome-repositories.com/f/testing-quality-assurance/naughty-string-lists.md) — Provides a comprehensive list of problematic strings for testing input validation and security. ([source](https://github.com/minimaxir/big-list-of-naughty-strings#readme))
- [Software Testing Datasets](https://awesome-repositories.com/f/testing-quality-assurance/software-testing-datasets.md) — Provides a curated collection of edge-case strings designed to identify common input validation errors.
- [Input Validation Testing](https://awesome-repositories.com/f/testing-quality-assurance/input-validation-testing.md) — Ensures software applications handle unexpected or malicious user input gracefully.
- [Test Data Sets](https://awesome-repositories.com/f/testing-quality-assurance/test-data-sets.md) — Provides a raw text corpus designed for universal compatibility in automated testing pipelines.
- [Validation Inputs](https://awesome-repositories.com/f/testing-quality-assurance/validation-inputs.md) — Acts as a language-neutral collection of edge-case inputs for testing validation and sanitization logic.
- [Validation Benchmarks](https://awesome-repositories.com/f/testing-quality-assurance/validation-benchmarks.md) — Provides a standardized reference set of problematic text inputs to stress-test data parsing logic.
- [Robustness Testing](https://awesome-repositories.com/f/testing-quality-assurance/robustness-testing.md) — Verifies that data processing pipelines remain stable when encountering unusual or non-standard text formats.
- [Test Corpus Management](https://awesome-repositories.com/f/testing-quality-assurance/test-corpus-management.md) — Tracks and updates a growing collection of problematic strings for automated software testing.

### Security & Cryptography

- [Fuzzing Resources](https://awesome-repositories.com/f/security-cryptography/fuzzing-resources.md) — Provides a comprehensive repository of malicious or malformed character sequences to trigger unexpected behavior.
- [Security Auditing Tools](https://awesome-repositories.com/f/security-cryptography/security-auditing-tools.md) — Identifies potential injection flaws by testing systems against known problematic character sequences.
- [Data Sanitization Strategies](https://awesome-repositories.com/f/security-cryptography/data-sanitization-strategies.md) — Supplies shared test inputs for checking how different systems clean and escape user-provided text.
